# Please read all the sheets of the Excel file, then do the quiz. You have to solve 5 sheets. The dataset is also in the file, and the last sheet has some hints.

## Linear Regression (20%)

Use Excel to run a linear regression on the Quiz 2 Dataset tab using donor_weight as the independent variable. Save the regression results on a new sheet in this submission.

1. What is the regression equation?
2. What is the model's Adjusted R-Squared?
3. What would the model predict for a 30-year-old male who is 62 inches tall?
4. How many terms are NOT significant at a 10% critical value?
5. If we increased the number of observations, the coefficients' p-values would do what?
6. Beyond coefficient significance, identify two other concerns with these results.
7. (Second concern for question 6.)
8. What is the model's Sum of Squares Residual?
9. What is its Sum of Squares Total?
10. With linear and logistic regression, we convert qualitative variables into indicators, then we remove one of the indicators from the model to avoid what?

## Regression Issues (20%)

What issue does each assess or address (2 pts each)?

11. Plot #1
12. Plot #2
13. Plot #3
14. Plot #4
15. Variance Inflation Factor
16. The diagonal of a Hat Matrix
17. Information Criterion
18. Outliers
19. Regularization
20. Principal Components

## PCA & ANOVA (20%)

Perform a Principal Components Analysis on the Quiz 2 Dataset tab. (You'll need to make it a CSV file.)

- If you're using SAS EM: upload the file to Enterprise Miner (https://www.youtube.com/watch?v=nd1otR42ARs), connect the dataset to the Principal Components tool on the Modify tab, and, from the results, find the Eigenvalues of Correlation Matrix report.
- If you're using Python: use "from sklearn.decomposition import PCA" with n_components = 4.

What proportion of variance is explained by

21. the first component?
22. the second component?
23. the third component?
24. How many principal components can this dataset have?
25. A plot showing the components' explained variance in declining order is called a what?

The prior visit count for blood donors was sampled 13 times from each ethnicity, with the results listed under Prior Visit Counts below. Use the Data Analysis tool in Excel to run a Single Factor ANOVA. Save the regression results on a new sheet in this submission.

Prior Visit Counts (columns: Not Hispanic/Latino, Hispanic or Latino, Prefer not to answer; each of the 13 rows appears with its three values run together):

1204, 15168, 1179, 1932, 2610, 5002, 6572, 6171, 1938, 34172, 308, 30172, 102

26. What is the Sum of Squares between groups?
27. Is the mean Prior Visit Count the same across Ethnicity at a 20% critical value?
28. What proportion of Total Sum of Squares comes from between groups?
29. What is the variance of the Hispanic or Latino sample?
30. What is the mean Prior Visit Count within the Not Hispanic/Latino sample?

## Logistic Regression (20%)

A model predicting a Titanic passenger's survival is summarized here:

    Optimization terminated successfully.
             Current function value: 0.488873
             Iterations 6
                               Logit Regression Results
    ==============================================================================
    Dep. Variable:               Survived   No. Observations:                  714
    Model:                          Logit   Df Residuals:                      707
    Method:                           MLE   Df Model:                            6
    Date:                Mon, 16 Oct 2023   Pseudo R-squ.:                  0.2762
    Time:                        22:00:13   Log-Likelihood:                -349.06
    converged:                       True   LL-Null:                       -482.26
    Covariance Type:            nonrobust   LLR p-value:                 1.274e-54
    ==============================================================================
                     coef    std err          z      P>|z|      [0.025      0.975]
    ------------------------------------------------------------------------------
    Age           -0.0088      0.006     -1.456      0.145      -0.021       0.003
    SibSp         -0.1918      0.115     -1.670      0.095      -0.417       0.033
    Parch          0.0090      0.114      0.079      0.937      -0.215       0.233
    Fare           0.0070      0.003      2.236      0.025       0.001       0.013
    FirstClass     1.5842      0.305      5.195      0.000       0.987       2.182
    Male          -2.2398      0.199    -11.269      0.000      -2.629      -1.850
    EmbarkedS      0.6597      0.210      3.148      0.002       0.249       1.071
    ==============================================================================

Based on this summary report . . .

31. What is the linear equation that would feed into the sigmoid function? (Just the first few terms is fine.)
32. Which variable has a potentially insignificant coefficient?
33. What proportion of the survival variance is explained by the model?
34. How much would the odds of survival rise by having FirstClass?
35. How many times did the logistic attempt to estimate these coefficients?
36. How many passengers are in this dataset?
37. Which link was used to estimate this model?

According to this model . . .

38. Are older people more or less likely to survive?
39. Are men more or less likely to survive?
40. Are people who paid more for their ticket more or less likely to survive?

## Confusion Tables & Misc (20%)

From the confusion matrix below, calculate . . .

| Judged \ Actual | Innocent | Guilty |
| --- | --- | --- |
| Acquitted | 248 | 81 |
| Convicted | 42 | 209 |

41. Accuracy
42. Recall
43. Precision
44. Type I Error Rate
45. Type II Error Rate
46. Specificity
47. F1
48. A logistic dataset's target is 97% negative. What issue needs to be addressed?
49. What method is used to predict a student's grade (A, B, C, etc.)?
50. True or false: with more than 4 classes, One vs One classifiers are less complex than One vs All.
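The arithmetic behind the logit summary and the confusion matrix can be checked with a short script. This is only a minimal sketch under stated assumptions: the fused confusion-matrix cells are read as Acquitted 248 / 81 and Convicted 42 / 209, Guilty is treated as the positive class (so Convicted is the positive prediction), and the reported Pseudo R-squ. is taken to be McFadden's 1 - LL/LL-Null. The function names are made up for illustration and are not part of the quiz.

```python
import math


def confusion_metrics(tp, fp, fn, tn):
    """Classification metrics from the four cells of a 2x2 confusion matrix."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "recall": recall,
        "precision": precision,
        "type_i_error_rate": fp / (fp + tn),   # false positives among actual negatives
        "type_ii_error_rate": fn / (fn + tp),  # false negatives among actual positives
        "specificity": tn / (tn + fp),
        "f1": 2 * precision * recall / (precision + recall),
    }


def odds_ratio(coef):
    """Multiplicative change in the odds for a one-unit increase in a logit predictor."""
    return math.exp(coef)


def mcfadden_pseudo_r2(log_likelihood, ll_null):
    """McFadden's pseudo R-squared: 1 - LL / LL-Null."""
    return 1 - log_likelihood / ll_null


# Assumed reading of the fused cells: positive class = Guilty, positive prediction = Convicted.
metrics = confusion_metrics(tp=209, fp=42, fn=81, tn=248)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")

# Values copied from the Logit summary: FirstClass coef, Log-Likelihood, LL-Null.
first_class_odds = odds_ratio(1.5842)
pseudo_r2 = mcfadden_pseudo_r2(-349.06, -482.26)
print(f"FirstClass odds ratio: {first_class_odds:.3f}")
print(f"pseudo R-squared: {pseudo_r2:.4f}")
```

If Innocent were treated as the positive class instead, the `tp`/`tn` and `fp`/`fn` arguments would swap, and recall, precision, and both error rates would change accordingly.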
## Quiz 2 Dataset

Columns: donor_weight, donor_height, donor_age, donor_male. Each record below is the four values run together into one 8-digit string: weight (3 digits), height (2 digits), age (2 digits), male (1 digit). For example, 20970310 is weight 209, height 70, age 31, male 0.

20970310 17372551 21063530 16966511 23064160 16064320 17164340 11563660 19462541 13367380 28875201 12062420 17567461 20970660 28875291 17963490 17164480 12062580 18873661 18873371 22870531 22364450 26674551 15567620 19462941 15367220 13261590 15561750 18468650 16165531 13367420 28875161 22870161 18468780 16765521 22364230 26674161 13261320 17466480 12962400 20970590 17561590 26674341 16765181 19462171 21063520 16265600 18468650 17865200 17567451 23064400 22870491 16561760 16165481 22374681 18071240 26670241 19462491 17567401 17567501 22364780 22364610 16266601 26674571 26674531 14864590 10769750 17567331 12062330 16765161 17466580 16966391 21565500 17567791 16966461 16765611 18468660 22571801 13261650 10769410 18862580 13367370 21564810 16966711 13261590 22870441 14264461 18568341 19664420 22164731 22571441 17370601 16470571 22164181 14264621 16765711 16765721 16966471 17264220 16765541 16765431 13764340 17374511 10769520 28875231 17567591 16966571 18872631 21063750 16765641 17466570 14368470 12062530 26674511 15367810 23363460 15766600 13261470 16165161 16165691 16266641 13764190 28875491 25464190 15367550 12964260 17567611 22164611 24973680 10769630 18468270 22870591 14368171 18873211 22571421 12668621 13261260 17561180 16765771 16561630 26674301 16266351 17466580 23064500 13261210 18468180 18468420 18468240 21063550 17567221 16561680 18071620 17164670 14368590 13863450 23468751 15859390 17567621 18468740 13261700 17567401 22870181 17561580 14368720 13568361 11960440 28875181 10769170 18872641 26670391 15964460 18063420 18468620 17466540 17164170 17371531 20970540 21063490 17370691 16765401 19664400 19072551 22374401 17963620 13261430 22164321 17466360 23870251 17567441 20970220 18468730 18873201 17567721 21063590 17567171 21063190 12264480 12062310 13261480 17466630 20970620 12062660 19462421 17567501 13261620 13261180 18071560 14264521 17567821 12062660 26674281 16765511 16966361 16966541 12062630 
10769610 14368510 22364270 15066380 22364330 28875411 16165531 18063580 22164361 10769630 19462401 18873541 22164551 17374171 16272161 15367570 20970540 22164571 20970660 14368600 12964500 16272211 16463210 16561340 28875611 13367630 16266311 18873481 16765691 11960670 21063470 17466160 18468530 21565260 18071570 19071161 21063320 13367580 15766530 17567321 11563750 15766310 26674171 23468361 16165261 12668571 16368340 17567551 19462571 18873441 16266671 23262290 14368451 16272691 26468391 22164491 11960580 17164320 18071170 17466610 19462651 15066510 20970690 22164521 21063720 21063650 16165661 16561550 20873221 16265170 16765691 16272351 12668591 18071320 16165591 21272301 17466620 10769390 28875271 20873611 10769830 13261650 16966631 18873171 17466680 12668161 17370771 21063380 16368410 28875761 26674191 17372681 17164610 14362350 16165671 14464380 17374401 16966421 16165411 20970490 26674641 15367630 17164540 21063710 21063260 16966661 22364660 18873491 17561280 15859210 17164470 19462341 14561661 12165170 18468480 18468620 17567521 20970620 21063270 17567341 16765451 19072741 18468300 14368620 17561340 18468170 16165511 15367170 20970660 16265440 17567481 11960170 17164610 21063580 18872661 13261340 13261280 17164650 17567581 16561630 22364480 19462641 12668381 28875611 18468230 16368450 13261710 22364290 17370551 16165341 17466360 18862280 12062170 15466160 26674431 14264481 12062410 19462591 18468280 16266271 14864670 13261410 18267271 22364740 19072391 17164660 23468161 26674661 18873491 12668231 16966231 14864460 14264451 16266501 16966551 22164271 17567621 16165671 13261170 22164521 13261160 10666160 17164520 13261160 22870391 13367470 16368630 13367630 28875691 22374621 17561790 11960170 14368570 22374371 11960600 14464500 13367690 12668581 15367280 16165671 22870621 26468751 17466340 13261480 17466300 17264530 20970340 15367510 18873491 16463490 13367500 24362540 16765251 17561660 12062580 22374821 19664300 26674421 23064710 12165430 21063220 18873361 
17466730 13367400 22571181 14864170 10769380 17371341 17567461 17466290 18873571 17370591 19462531 19664790 10769530 18873271 18468660 18468710 22164391 17567571 17567621 13261360 13367210 18468580 13261520 18468430 19462171 12165330 17567431 19462191 16765691 18862230 18872491 16368550 28875541 12062620 16165571 21564360 17370441 17865350 13261160 13764580 16266571 17567661 16966401 13568501 19462641 18372191 16967240 21063170 21063580 22364630 15766160 17466580 26674411 22870321 20970400 17561510 16966601 18468450 10769780 19462171 13261680 11960250 14264601

## Lists

- Independent columns, Multicollinearity, Linear relationships, Normal distributions, Regularization, Influential columns, Influential rows, Comparing Models, Heteroscedasticity, Overfitting
- TRUE, FALSE
- Increase, Decrease, Remain the same
- Power, Significance
- Log, Interaction
- More Likely, Less Likely, Just as Likely
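The PCA step asks for the Quiz 2 Dataset as a CSV file, so the run-together records need to be split into columns first. This is a minimal sketch, assuming every record is an 8-digit string laid out as donor_weight (3 digits), donor_height (2), donor_age (2), donor_male (1). `parse_record` and `records_to_csv` are hypothetical helper names, and the three sample records are the first three in the dataset.

```python
import csv
import io
import re

COLUMNS = ["donor_weight", "donor_height", "donor_age", "donor_male"]


def parse_record(token):
    """Split one 8-digit record into (weight, height, age, male), assuming a 3-2-2-1 layout."""
    if not re.fullmatch(r"\d{8}", token):
        raise ValueError(f"expected an 8-digit record, got {token!r}")
    return (int(token[0:3]), int(token[3:5]), int(token[5:7]), int(token[7]))


def records_to_csv(text):
    """Convert whitespace-separated records into CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(COLUMNS)
    for token in text.split():
        writer.writerow(parse_record(token))
    return buffer.getvalue()


sample = "20970310 17372551 21063530"
print(records_to_csv(sample))
```

Any record that is not exactly eight digits raises an error instead of being split by guesswork, which makes a record that breaks the assumed layout easy to spot before the file goes into Excel, SAS EM, or sklearn.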
