G. Umphrey STAT*2120: Probability and Statistics for Engineers Winter 2021 Assignment #6 Part A is an “individual” part and Part B is a “group project” part; these two parts will have separate...



G. Umphrey
STAT*2120: Probability and Statistics for Engineers
Winter 2021
Assignment #6

Part A is an “individual” part and Part B is a “group project” part; these two parts will have separate Crowdmark templates. For Part B, only one student from each group submits the solutions to Crowdmark. Further instructions and some hints will be provided in class.

This assignment uses data sets from Hand et al.’s A Handbook of Small Data Sets. The data sets for the first two questions are posted on CourseLink. The data set for question 4 is not posted.

Part A (Individual Work Only)

1. Refer to the data on “Abrasion Loss”. Hand et al. describe the data set as follows:

“The data come from an experiment to investigate how the resistance of rubber to abrasion is affected by the hardness of the rubber and its tensile strength. Each of 30 samples of rubber was tested for hardness (in degrees Shore; the larger the number, the harder the rubber) and for tensile strength (measured in kg per square centimetre), and was then subjected to steady abrasion for a fixed time. The weight loss due to abrasion was measured in grams per hour.”

To summarize, the data set has one response variable, Abrasion Loss (units: g/h), and two predictor variables, Hardness (units: degrees S) and Tensile Strength (units: kg/cm²). Most parts of this question will focus on simple linear regression and correlation analysis. A bonus part will provide a glimpse into multiple regression analysis.

(a) Obtain the correlation coefficients between Abrasion Loss and each of Hardness and Tensile Strength. Based on these values, briefly explain why Hardness is likely a better predictor of Abrasion Loss than Tensile Strength is.

(b) Obtain scatterplots of Abrasion Loss on Hardness and Abrasion Loss on Tensile Strength. Remember to label your axes properly, and include titles on the graphs.

(c) Use R to obtain the simple linear regression equations that represent the least squares “lines of best fit” for the data in each scatterplot. Report these equations.

(d) Fit the lines given by your equations in (c) to the appropriate scatterplots. You will hand these graphs in.

(e) State an appropriate model for statistical inference. Your model will consist of an equation with an error term and distributional assumptions. Use β0 to represent the population Y-intercept and β1 to represent the population slope.

(f) Under standard assumptions that you should have stated in (e), obtain individual 95% confidence intervals for the population Y-intercept β0 and the population slope β1.

(g) When you use the lm() procedure to regress Abrasion Loss on Hardness, the summary() function reports two t test statistic values. Which one tests for a linear association between Abrasion Loss and Hardness? State the null and alternative hypotheses being tested in terms of the appropriate model parameter. What is the p-value associated with this test, and what does it allow you to conclude at the 5% significance level?

(h) As in (g), conduct a t test for a linear association between Abrasion Loss and Tensile Strength at the 5% significance level.

(i) By default, R conducts the test in (g) as a two-sided test. A researcher in this area would very likely have conducted it as a one-sided test. Briefly, what would be the rationale for performing a one-sided test, and if the (reasonable) one-sided test was conducted, what would the p-value be?

(j) A test for a linear association between Abrasion Loss and Hardness can also be conducted using an ANOVA F test. The F test statistic value can be found on the summary() output. It can also be found on the anova() output that gives the ANOVA table used to obtain the test statistic value. You should note that the F test statistic and the t test statistic in (g) have the same p-values. What is the relationship between the two test statistic values?

(k) Show how the predicted value and the residual are calculated for the first observation in the data set using the least squares regression equation you calculated for the data set.

(l) Show how the SSE (Sum of Squares for Error) can be calculated using the residuals.

(m) BONUS QUESTION. All previous parts of this question used one predictor variable at a time. But we can include both predictor variables in a single multiple regression equation using a model of the form Y ~ X1 + X2. Obtain this multiple regression equation, and use it to obtain a predicted Abrasion Loss for a rubber sample with a Hardness of 60 degrees S and a Tensile Strength of 220 kg/cm². What is the increase in the proportion of the Total SS explained by the regression model, when you compare the multiple regression model fit to the best simple linear regression model fit?

PART B (Groups of 2 or 3 encouraged, but individual is OK too)

2. Refer to the data on “Tensile Strength of Cement”. Hand et al. describe the data set as follows:

“The tensile strength of cement depends on (among other things) the length of time for which the cement is dried or ‘cured’. In an experiment, different batches of cement were tested for tensile strength after different curing times. The relationship between curing time and strength is non-linear. Hald regresses log tensile strength on the reciprocal of curing time.”

(a) Obtain a scatterplot of tensile strength on curing time.

(b) Obtain another scatterplot of log10(tensile strength) on 1/(curing time).

(c) Fit a pseudolinear model to the data as plotted in (b) by letting Y = log10(tensile strength) and X = 1/(curing time). State the fitted model equation, and superimpose the straight line on the scatterplot in (b). You will hand this graph in; don’t forget to label the axes properly and put a title on the graph.

(d) Now use the equation from (c) to fit a curve on the scatterplot in (a). You will hand this graph in. What is the equation of the curve you fit? [Hints on how to get R to draw the curve will come in class.]

3. Text, p. 382, Exercise 10.82.

4. The following data come from Hand et al. (1994), Handbook of Small Data Sets, and are described as follows:

“Some individuals are carriers of the bacterium Streptococcus pyogenes. To investigate whether there is a relationship between carrier status and tonsil size in schoolchildren, 1398 children were examined and classified according to their carrier status and tonsil size.”

                     Carrier status
Tonsil Size     Carrier   Non-carrier   Total
Normal               19           497     516
Large                29           560     589
Very Large           24           269     293
Total                72          1326    1398

(a) Perform a contingency table analysis to test if there is evidence of a relationship between tonsil size and carrier status at the 5% level of significance.

(b) Use R to obtain the p-value associated with the value of the test statistic. Is the p-value consistent with your conclusion in (a)?
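
For question 4, the contingency table analysis could be sketched in R as follows. This is a minimal sketch that assumes Pearson's chi-square test of independence is the intended method (the assignment does not name the test). The object names tonsils, test, expected, stat_by_hand and p_by_hand are illustrative only.

```r
# Counts copied from the table in question 4
# (rows: tonsil size, columns: carrier status).
tonsils <- matrix(
  c(19, 497,
    29, 560,
    24, 269),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    TonsilSize    = c("Normal", "Large", "Very Large"),
    CarrierStatus = c("Carrier", "Non-carrier")
  )
)

# Pearson chi-square test of independence (assumed method);
# degrees of freedom = (3 - 1) * (2 - 1) = 2.
test <- chisq.test(tonsils)

# The same statistic computed by hand from the expected counts,
# expected = (row total * column total) / grand total.
expected <- outer(rowSums(tonsils), colSums(tonsils)) / sum(tonsils)
stat_by_hand <- sum((tonsils - expected)^2 / expected)

# Upper-tail p-value from the chi-square distribution with 2 df.
p_by_hand <- pchisq(stat_by_hand, df = 2, lower.tail = FALSE)

print(test)
print(c(statistic = stat_by_hand, p.value = p_by_hand))
```

Comparing p_by_hand with 0.05 gives the decision for part (a), and test$p.value is the value requested in part (b); the two approaches should agree.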
Answered 1 day after Apr 08, 2021

Answer To: G. Umphrey STAT*2120: Probability and Statistics for Engineers Winter 2021 Assignment #6 Part A is...

Naveen answered on Apr 10 2021
############ Question1 ##############
# Reading data from the CSV file
Abrasion <- read.csv('hand006-abrasion-loss-ruzgatu5.csv')
# a)
# Correlation between Abrasion.Loss and Hardness
cor(Abrasion$Abrasion.Loss,Abrasion$Hardness)
# Correlation between Abrasion.Loss and Tensile.Strength
cor(Abrasion$Abrasion.Loss,Abrasion$Tensile.Strength)
# b)
# Scatter plot of Abrasion.Loss (response, y-axis) on Hardness (predictor, x-axis)
plot(Abrasion$Hardness, Abrasion$Abrasion.Loss,
     xlab = "Hardness (degrees Shore)",
     ylab = "Abrasion loss (g/h)",
     main = "Abrasion loss versus Hardness")
# Scatter plot of Abrasion.Loss (response, y-axis) on Tensile.Strength (predictor, x-axis)
plot(Abrasion$Tensile.Strength, Abrasion$Abrasion.Loss,
     xlab = "Tensile strength (kg/cm^2)",
     ylab = "Abrasion loss (g/h)",
     main = "Abrasion loss versus Tensile Strength")
# c)
# simple linear regression between Abrasion.Loss and...
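
A minimal sketch of how the regression steps in parts (c), (f), (g), (j) and (l) could be carried out with lm(), summary(), confint() and anova(). The helper fit_slr() and the data frame toy are hypothetical and not part of the original answer; the toy values are made up purely so the block runs on its own and are not the Hand et al. data. With the real data, the call would presumably be fit_slr(Abrasion, "Abrasion.Loss", "Hardness").

```r
# Hypothetical helper: fits a simple linear regression and collects the
# quantities asked for in parts (c), (f), (g), (j) and (l).
fit_slr <- function(df, response, predictor) {
  fit   <- lm(reformulate(predictor, response = response), data = df)
  coefs <- summary(fit)$coefficients
  list(
    fit      = fit,
    equation = coef(fit),                        # (c) intercept and slope
    ci95     = confint(fit, level = 0.95),       # (f) 95% CIs for beta0, beta1
    t_slope  = coefs[predictor, "t value"],      # (g) t statistic for H0: beta1 = 0
    p_slope  = coefs[predictor, "Pr(>|t|)"],     # (g) two-sided p-value
    f_stat   = anova(fit)[predictor, "F value"], # (j) ANOVA F statistic
    sse      = sum(residuals(fit)^2)             # (l) SSE from the residuals
  )
}

# Made-up illustrative data (NOT the abrasion loss data set).
toy <- data.frame(
  x = c(1, 2, 3, 4, 5, 6, 7, 8),
  y = c(9.7, 8.4, 6.1, 5.2, 3.8, 3.1, 1.2, 0.4)
)

res <- fit_slr(toy, response = "y", predictor = "x")

print(res$equation)
print(res$ci95)

# In simple linear regression the F statistic equals the squared slope t statistic.
print(c(t_squared = res$t_slope^2, F = res$f_stat))

# SSE from the residuals matches the "Residuals" sum of squares in the ANOVA table.
print(c(sse = res$sse, anova_sse = anova(res$fit)["Residuals", "Sum Sq"]))
```

For the one-sided test in part (i), the usual approach is to halve the two-sided p-value when the sign of the estimated slope agrees with the alternative hypothesis.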