HW#3 Data Analysis The data set (file: StatData.csv) for this HW is from a survey of students enrolled in Introduction to Statistics class in Fall 2018. The variables are: • Gender: female, male, and...

See the uploaded files. The instructions are in Data Analysis.pdf.


HW#3 Data Analysis

The data set (file: StatData.csv) for this HW is from a survey of students enrolled in an Introduction to Statistics class in Fall 2018. The variables are:

• Gender: female, male, and non-binary.
• Age
• VideoGames: Number of hours spent playing video games during the last week
• Bedtime: To keep the data from being disjoint across midnight, values above 12 correspond to morning times. For example, a Bedtime of 13 is 1am.
• Sleep: Hours of sleep last night
• Programming Experience: Relevant experience for the class, in which we used R. The scale was 1-5, where 1 = no programming experience and 5 = used R before.

We’ll only be looking at VideoGames and Bedtime for this assignment. Suppose we hypothesize that students who play more VideoGames have a later Bedtime, and we want to use this data to investigate.

1. Look at a scatterplot of Bedtime vs VideoGames. Does it look like there’s much of a relationship? Also look at the correlation.

2. As done in the lecture code for 10/18 (file: 2019_10_18-Outliers), fit the model for predicting Bedtime from VideoGames. We’re more interested in identifying unusual points than in the model itself, so get the hat matrix and find the observation with the largest value of h. What is unusual about this observation? What is the value of h? Look at the observations corresponding to the 6 largest values of h. (The order(), tail(), and head() functions might be useful here.) Compare to the plot you made and identify the points with the largest values of h.

3. Calculate the studentized residuals for all the points. Find the 6 observations with the largest absolute values of the studentized residuals. What is unusual about these points? Do the hypothesis test as done in the lecture code for 10/18 (file: 2019_10_18-Outliers) based on the studentized residuals. Do the test with and without the Bonferroni correction. Comment on the results.

4. Calculate Cook’s distance for each point (see the bottom of page 276 of the text, file: Chapter 11.pdf). Locate the six points on the scatterplot that have the largest values of Cook’s distance. Are these different from those identified in steps 2 and 3? How does this fit with the intuition behind what each of these is measuring?

5. Does it seem like the unusual points found in parts 2, 3, and 4 should be discarded? There’s no right or wrong answer; you might look over the discussion in chapter 11.5 (file: Chapter 11.pdf) and think about the data. Do you think the students were honest? Do you think all of the points represent typical days/weeks for the students? If the answer to either question is no, that might be a reason to discard the points if we’re actually interested in studying the relationship between VideoGames and Bedtime.
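The diagnostics asked for in steps 1 to 4 could be sketched in R roughly as follows. This is only an illustration under stated assumptions, not the lecture code: StatData.csv is not reproduced here, so the block runs on a small synthetic stand-in data set; the column names VideoGames and Bedtime are assumed to match the file; and rstudent() returns externally studentized residuals, which may or may not be the version used in 2019_10_18-Outliers (rstandard() gives the internally studentized ones).

```r
# Synthetic stand-in data so the sketch runs on its own (NOT the survey data).
# With the real file, something like the following should work instead,
# assuming the columns are named VideoGames and Bedtime:
#   dat <- read.csv("StatData.csv")
set.seed(1)
n <- 60
dat <- data.frame(VideoGames = rexp(n, rate = 1 / 3))
dat$Bedtime <- 11.5 + 0.1 * dat$VideoGames + rnorm(n, sd = 1)

# Step 1: scatterplot and correlation.
plot(dat$VideoGames, dat$Bedtime,
     xlab = "VideoGames (hours last week)", ylab = "Bedtime (13 = 1am)")
r <- cor(dat$VideoGames, dat$Bedtime)

# Step 2: fit the model and get the leverages h (diagonal of the hat matrix).
fit <- lm(Bedtime ~ VideoGames, data = dat)
X <- model.matrix(fit)
H <- X %*% solve(crossprod(X)) %*% t(X)   # hat matrix X (X'X)^{-1} X'
h <- diag(H)                              # same values as hatvalues(fit)
i_max <- which.max(h)                     # observation with the largest h
top_h <- tail(order(h), 6)                # indices of the 6 largest h
print(dat[top_h, ])

# Step 3: studentized residuals and outlier test, with and without Bonferroni.
t_i <- rstudent(fit)                      # externally studentized residuals
top_t <- tail(order(abs(t_i)), 6)         # 6 largest in absolute value
df_t <- df.residual(fit) - 1              # n - p - 1 degrees of freedom
p_raw <- 2 * pt(abs(t_i), df = df_t, lower.tail = FALSE)
alpha <- 0.05
flag_raw  <- which(p_raw < alpha)         # no correction
flag_bonf <- which(p_raw < alpha / n)     # Bonferroni: n tests at level alpha/n

# Step 4: Cook's distance, with the 6 largest marked on the scatterplot.
d <- cooks.distance(fit)
top_d <- tail(order(d), 6)
points(dat$VideoGames[top_d], dat$Bedtime[top_d], col = "red", pch = 19)
```

Comparing top_h, top_t, and top_d shows whether the same observations are flagged by all three measures. Leverage depends only on the VideoGames values, the studentized residuals depend on how far Bedtime falls from the fitted line, and Cook's distance grows with both, so the three lists need not agree.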
Oct 27, 2021