a) Create a scatterplot of the data with the response variable on the y-axis. Based on your scatterplot, decide whether to use SLR to explore the relationship between the number of cases and days...

The "Homework Document" file contains one question with multiple parts. Use the "covid19_canada.csv" file to answer it. Both an R script and an answer script are required.



a) Create a scatterplot of the data with the response variable on the y-axis. Based on your scatterplot, decide whether to use SLR to explore the relationship between the number of cases and the number of days since the 10th case. Justify your response.
b) [R script] Fit an SLR model to the dataset, and call it ‘model1’. Plot the fitted line on top of the scatterplot from (a) using the command abline(model1). Comment on the fit of your model to the data.
c) [R script] Make a QQ-plot of the residuals to check the normality assumption. Comment on your findings.
d) [R script] While not ideal, there are ways to work with data sets such as these. Note the shape of your data. Apply a log-transformation to the response variable and call this new variable ‘log.y’. Base ‘e’ is fine; if you prefer base ‘10’ (or any other base), please indicate it in your script. Make a new scatterplot of your transformed data.
e) [R script] Fit an SLR model to your transformed dataset and call it ‘model2’. Plot the fitted line on top of your scatterplot from (d) and comment on the fit.
f) [R script] Perform a complete residual analysis on model2. Include your residual plots in your analysis. Is SLR appropriate for this transformed dataset?
g) State the estimated model (be careful of notation and pay attention to the variables in your model!). What does the estimated model say about the potential association between the number of cases and the number of days since the 10th case? That is, interpret the coefficients of model2.
h) The data only goes as far as March 21, 2020, which is roughly one week after the beginning of social distancing. For all intents and purposes, this data shows the progression of the number of cases without any intervention. Use your model to predict the expected number of cases, nationally, for Sunday April 12, 2020, and compare it with the most recently available numbers (either Saturday April 11 or Sunday April 12). Based on your model, does it appear that social distancing is having an effect in curbing the number of cases in Canada? Comment on the validity of your prediction.
i) Using what you have explored in this dataset, describe how you would recognize whether a dataset should be log-transformed before checking whether an SLR model would be appropriate. Describe how you would decide whether to transform the response variable or the explanatory variable. (This is a thinking problem. You are expected to carefully consider why a log-transformation was useful in this problem so you can apply this to future scenarios.)
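
A minimal R sketch of the workflow in parts (a) to (h) might look like the following. It is not a worked solution: the columns of "covid19_canada.csv" are not shown in the question, so the names `days` and `cases` are hypothetical, and the data frame below is simulated purely so the script runs on its own. The sketch also assumes the last row of the data corresponds to March 21, 2020, which puts April 12 exactly 22 days later.

```r
# Simulated stand-in for covid19_canada.csv (NOT the real data).
# With the real file, something like
#   covid <- read.csv("covid19_canada.csv")
# would replace this block, using the file's actual column names.
set.seed(1)
days  <- 0:20                                   # hypothetical: days since 10th case
cases <- round(10 * exp(0.25 * days + rnorm(length(days), sd = 0.1)))
covid <- data.frame(days = days, cases = cases)

# (a)-(b): scatterplot with the response on the y-axis, then SLR on raw counts
plot(covid$days, covid$cases,
     xlab = "Days since 10th case", ylab = "Number of cases")
model1 <- lm(cases ~ days, data = covid)
abline(model1)

# (c): QQ-plot of the residuals of model1
qqnorm(resid(model1))
qqline(resid(model1))

# (d): natural-log transform of the response, new scatterplot
covid$log.y <- log(covid$cases)
plot(covid$days, covid$log.y,
     xlab = "Days since 10th case", ylab = "log(number of cases)")

# (e): SLR on the transformed response
model2 <- lm(log.y ~ days, data = covid)
abline(model2)

# (f): residual analysis for model2
plot(fitted(model2), resid(model2),
     xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
qqnorm(resid(model2))
qqline(resid(model2))

# (g): the slope is on the log scale; exp(slope) is the estimated
# multiplicative change in cases per additional day
b <- coef(model2)
growth_factor <- exp(b[["days"]])

# (h): April 12 is 22 days after March 21 (10 days left in March + 12 in April);
# this assumes the last observed day is March 21
april12 <- data.frame(days = max(covid$days) + 22)
pred_cases <- exp(predict(model2, newdata = april12))
```

Because model2 is fitted to log.y, its fitted line has the form log(cases) = b0 + b1 * days, so predictions must be back-transformed with exp() before being compared with reported case counts. A prediction 22 days past the last observation is an extrapolation, which is worth keeping in mind when commenting on its validity.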
Apr 11, 2021
