STAT 4385 Spring, 2020 Applied Regression Analysis Midterm Exam III Instructions: This is an open-book and open-notes take-home exam that is expected to complete within the designated time frame....

The format of this assignment is fill-in-the-blank.


STAT 4385, Spring 2020
Applied Regression Analysis
Midterm Exam III

Instructions: This is an open-book, open-notes take-home exam that you are expected to complete within the designated time frame. Please pay attention to symbols such as [0]. These correspond exactly to the answers that you need to submit via Blackboard. When you are done, submit your answer sheet (by scanning it or taking a picture of it with your smartphone) via Blackboard by 2:00 pm. If you cannot print out the answer sheet, you can easily make your own, but please clearly label the blanks from 1 to 33. If you do not have access to Blackboard for technical reasons, emailing me your answer sheet at [email protected] before the due time is also fine. Please name your file YourLastName-ExamIII.PDF (or use another extension) if possible. Whenever necessary, apply the significance level α = 0.05 in hypothesis testing problems.

1. Rehabilitation Therapy

A rehabilitation center researcher was interested in examining the relationship between physical fitness (X) prior to surgery of persons undergoing corrective knee surgery and time (Y) required in physical therapy until successful rehabilitation. Patient records in the rehabilitation center were examined, and 24 male subjects ranging in age from 18 to 30 years who had undergone similar corrective knee surgery during the past year were selected for the study. The number of days required for successful completion of physical therapy and prior fitness status (below average, average, and above average) for each patient were recorded. The collected data are presented below, together with some summary statistics. The overall mean and sample standard deviation (SD) are ȳ = 32 and s_y = 6.878.
Fitness         Days in physical therapy (Y)     Sample Size (n_k)   Sample Mean (ȳ_k)   Sample SD (s_k)
Below Average   29 40 42 30 38 42 40 43          8                   38                  5.477
Average         30 31 35 29 39 35 28 29 31 33    10                  32                  3.464
Above Average   26 22 32 21 20 23                6                   24                  4.427

In order to compare the mean number of days (Y) required in physical therapy until successful rehabilitation among the three physical fitness groups (X), one-way ANOVA is performed via the regression approach. To do so, two dummy variables are introduced to account for the fitness groups:

z_i1 = 1 if the i-th subject is at the average fitness level, 0 otherwise
z_i2 = 1 if the i-th subject is below average in fitness, 0 otherwise

Then we consider the following model:

y_i = β0 + β1 z_i1 + β2 z_i2 + ε_i, with ε_i IID ~ N(0, σ²).    (1)

The fitting results are tabulated below.

Table of Parameter Estimates
parameter   estimate   s.e.    t        p-value
β0          [1]        1.817   13.208   0.000
β1          [2]        2.298   [4]      0.010
β2          [3]        2.404   5.824    0.000

ANOVA Table
Source   df    SS     MS     F Value
Model    [5]   [8]    [11]   16.962
Error    [6]   [9]    [12]
Total    [7]   [10]

Based on the above information, answer the following questions:

(a) Complete the above tables by filling in the blanks.

(b) With this dummy variable coding scheme, which level (below, average, or above) of fitness status (X) will be used as the baseline or reference for comparison? [13]: ______

(c) The ANOVA F test is formally performed to assess the effect of fitness status on the number of days required in physical therapy until successful rehabilitation. To this end,
    i. Express the null hypothesis of no effect in terms of the βj's, i.e., [14] H0: ______
    ii. The observed F test statistic is [15] Fobs: ______
    iii. With significance level α = 0.05, find the critical value [16]: ______ and draw a conclusion by selecting a choice below:
         (a) Reject H0 at the significance level α = 0.05.
         (b) Cannot reject H0 at the significance level α = 0.05.

(d) Compute the coefficient of determination R² and interpret it. [17] R² = ______, representing the percentage of ______.
(e) Suppose that the researcher is particularly interested in comparing the above average and below average groups. Let µabove and µbelow denote the mean number of days required in physical therapy until successful rehabilitation for all individuals whose fitness statuses are above and below average, respectively. On the basis of Model (1), perform a formal test to see if µbelow is significantly higher than µabove.
    i. Express the null hypothesis in terms of the βj's, i.e., [18] H0: ______
    ii. The observed test statistic is [19] tobs: ______
    iii. With significance level α = 0.05, find the critical value [20]: ______ and draw a conclusion by selecting a choice below:
         (a) Reject H0 at the significance level α = 0.05.
         (b) Cannot reject H0 at the significance level α = 0.05.

2. We consider a marketing study where the aim is to predict monthly sales (Y) (in million dollars) of a retail company based on the advertising budget (in thousand dollars) invested in youtube (X1), facebook (X2) and newspaper (X3). The data consist of observations on these variables over n = 200 consecutive months. A few data lines are tabulated below.

Month   Youtube (X1)   Facebook (X2)   Newspaper (X3)   Sales (Y)
1       276.12         45.36           83.04            26.52
2       53.40          47.16           54.12            12.48
3       20.64          55.08           83.16            11.16
...     ...            ...             ...              ...
198     212.40         11.16           7.68             15.36
199     340.32         50.40           79.44            30.60
200     278.52         10.32           10.44            16.08

Suppose that we analyze the data with multiple linear regression. A number of candidate models are fit and the sum of squared errors (SSE), or residual sum of squares, is obtained for each, as tabulated below.

Model   Form                                                          SSE       AIC    BIC
I       y = β0 + β1x1 + β2x2 + β3x3 + ε                               801.828   [21]   [25]
II      y = β0 + β1x1 + β2x2 + ε                                      801.956   [22]   [26]
III     y = β0 + β1x1 + β2x2 + β3x1x2 + ε                             251.256   [23]   [27]
IV      y = β0 + β1x1 + β2x2 + β3x3 + β4x1x2 + β5x1x3 + β6x2x3 + ε    244.699   [24]   [28]

(a) To determine the ‘best’ model, compute the AIC (Akaike information criterion) and BIC (Bayesian information criterion) for each model.

(b) According to AIC, which model is the best? [29]: ______
    According to BIC, which model is the best? ______

(c) Suppose that we decide to use Model II as our final model anyway because of its simple form. Model II provides the following fitted equation:

    ŷ = β̂0 + β̂1x1 + β̂2x2 = 3.505 + 0.046x1 + 0.188x2

    Referring to the data lines in the table, Month 1 (the very first row) has x1 = 276.12 and x2 = 45.36. For the design matrix X of Model II, the leverage of Month 1 is found to be h1 = 0.014. Based on all the information given above, answer the following questions:
    i. Obtain the mean square error (MSE) of Model II: [30] MSE = σ̂² = ______
    ii. Compute the fitted value ŷ1 for Month 1 with Model II: [31] ŷ1 = ______
    iii. Obtain the studentized jackknife residual for Month 1: [32] r(−1) = ______
    iv. Compute the Cook's distance for Month 1: [33] d1 = ______
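The quantities asked for in Problem 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the instructor's solution: the exam does not say which AIC/BIC definitions to use, so the sketch assumes the common regression forms AIC = n ln(SSE/n) + 2p and BIC = n ln(SSE/n) + p ln(n), where p counts all regression coefficients including the intercept. The jackknife residual and Cook's distance use the standard textbook shortcuts based on the internally studentized residual. All variable names are hypothetical.

```python
import math

n = 200  # number of months

# Model name -> (SSE, p), with p = number of coefficients including the intercept.
models = {
    "I": (801.828, 4),
    "II": (801.956, 3),
    "III": (251.256, 4),
    "IV": (244.699, 7),
}

def aic(sse, p, n):
    # Assumed form; additive constants that are the same for every model are dropped.
    return n * math.log(sse / n) + 2 * p

def bic(sse, p, n):
    # Assumed form; same likelihood term as AIC with a ln(n) penalty per coefficient.
    return n * math.log(sse / n) + p * math.log(n)

criteria = {m: (aic(sse, p, n), bic(sse, p, n)) for m, (sse, p) in models.items()}
best_aic = min(criteria, key=lambda m: criteria[m][0])
best_bic = min(criteria, key=lambda m: criteria[m][1])

# Model II diagnostics for Month 1 (values taken from the exam text).
p = 3
mse = models["II"][0] / (n - p)              # MSE = SSE / (n - p)
x1, x2, y1, h1 = 276.12, 45.36, 26.52, 0.014
y_hat1 = 3.505 + 0.046 * x1 + 0.188 * x2     # fitted value
e1 = y1 - y_hat1                             # raw residual
r1 = e1 / math.sqrt(mse * (1 - h1))          # internally studentized residual
# Jackknife (externally studentized) residual obtained from r1 without refitting.
r_jack = r1 * math.sqrt((n - p - 1) / (n - p - r1 ** 2))
# Cook's distance from r1 and the leverage.
cook_d = (r1 ** 2 / p) * (h1 / (1 - h1))

for m, (a, b) in criteria.items():
    print(f"Model {m}: AIC = {a:.3f}, BIC = {b:.3f}")
print("Best by AIC:", best_aic, "| best by BIC:", best_bic)
print(f"MSE = {mse:.4f}, y_hat1 = {y_hat1:.4f}, "
      f"r_jack = {r_jack:.4f}, Cook's d = {cook_d:.5f}")
```

Other AIC/BIC conventions differ from the assumed ones only by terms that are identical across the four models (for example, counting σ² as an extra parameter), so they shift the values but not the ranking.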
Answered Same Day - Apr 29, 2021

Answer To: STAT 4385 Spring, 2020 Applied Regression Analysis Midterm Exam III Instructions: This is an...

Sourav answered on Apr 30 2021
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   24.000      1.817  13.208 1.22e-11 ***
z2             8.000      2.298   3.481  0.00223 **
z3            14.000      2.404   5.824 8.81e-06 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 4.451 on 21 degrees of...
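The coefficient output above can be cross-checked directly from the raw data in Problem 1. The following is a minimal Python sketch, assuming that the above average group is the reference level (both dummy variables equal to zero) and that the rows labelled z2 and z3 correspond to β1 (average) and β2 (below average) in Model (1). It relies on the fact that, for a one-way layout with dummy coding, the least-squares intercept is the reference-group mean and each slope is a difference of group means. Variable names are illustrative.

```python
import math

# Days in physical therapy by prior fitness group (Problem 1 data).
days = {
    "below": [29, 40, 42, 30, 38, 42, 40, 43],
    "average": [30, 31, 35, 29, 39, 35, 28, 29, 31, 33],
    "above": [26, 22, 32, 21, 20, 23],
}

n = {g: len(v) for g, v in days.items()}
mean = {g: sum(v) / len(v) for g, v in days.items()}
N = sum(n.values())
grand_mean = sum(sum(v) for v in days.values()) / N

# ANOVA decomposition: within-group (error) and between-group (model) sums of squares.
sse = sum((y - mean[g]) ** 2 for g, v in days.items() for y in v)
ssr = sum(n[g] * (mean[g] - grand_mean) ** 2 for g in days)
sst = ssr + sse
df_model, df_error = len(days) - 1, N - len(days)
mse = sse / df_error
f_obs = (ssr / df_model) / mse
r2 = ssr / sst

# Coefficients with "above" as the assumed baseline.
b0 = mean["above"]
b1 = mean["average"] - mean["above"]
b2 = mean["below"] - mean["above"]
se_b0 = math.sqrt(mse / n["above"])
se_b1 = math.sqrt(mse * (1 / n["average"] + 1 / n["above"]))
se_b2 = math.sqrt(mse * (1 / n["below"] + 1 / n["above"]))

for name, b, se in [("b0", b0, se_b0), ("b1", b1, se_b1), ("b2", b2, se_b2)]:
    print(f"{name}: estimate = {b:.3f}, s.e. = {se:.3f}, t = {b / se:.3f}")
print(f"SSR = {ssr:.1f}, SSE = {sse:.1f}, SST = {sst:.1f}, "
      f"df = ({df_model}, {df_error}), F = {f_obs:.3f}, R^2 = {r2:.4f}")
print(f"Residual standard error = {math.sqrt(mse):.3f}")
```

Under these assumptions, the printed standard errors, t values, F statistic, and residual standard error should match the figures given in the exam tables and in the output above.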