
See assignment attachment. Use R data frame attachments.


Problem 1. The data frame Dryers found in Dryers.RData has 36 rows (observations) and 3 columns (variables). The data come from an experiment in which various types of clothing were dried in various types of dryers. The variables are the following:

    Clothing   One of three types of clothing: Towels, Jeans, or Thermal Clothing
    Dryer      One of four types of dryers: Electric, BD (bi-directional) Electric, Town Gas, or LPD
    kWh.kg     Energy effectiveness measured in kilowatt hours per kilogram of clothing dried

For each of the 12 settings of Clothing and Dryer there are 3 replications, giving a total sample size of 36.

(a) Fit a two-way ANOVA model where kWh.kg is the response variable and Clothing and Dryer are factors. Conduct the sequence of hypothesis tests that examines the main effects and their interactions using a test level of α = .05.

(b) State the assumptions of the classical ANOVA model. Provide and interpret a set of diagnostic plots that address their validity.

(c) Assuming that the model you fit in (a) is valid, conduct multiple comparisons of dryers to identify a single dryer or group of dryers that is the most energy efficient.

(d) Find the “best” Box-Cox transformation of kWh.kg and redo (a) with this transformation. Does the transformed model have better diagnostics than the original model?

Problem 2. The data set Mutual.RData contains two objects, mu and V, that are the mean vector and covariance matrix of annual rates of return of five investment funds. Here is mu:

    SP500  HighTech  SmallCap  USTreas  CorpBond
     0.06      0.10      0.08     0.02      0.04

This says, for example, that the annual mean return on investment for the US Treasuries fund is 2 percent. Assume that the rates of return are jointly distributed as multivariate normal. Suppose that Tom and Ellen each invest $1000 at the beginning of the year. Tom puts $500 in the High Tech fund and $500 in the Small Cap fund. Ellen puts $200 in each of the five funds. What is the probability that Ellen will have made more money than Tom at the end of the year?

Problem 3. The data frame BCMort88 found in the file BCMort88.RData gives breast cancer mortality rates for 217 counties in nine states (six New England states plus NY, NJ, and PA) for the year 1988. The rates are adjusted to account for differing demographic characteristics across the counties. Here is a data description:

    Pop       Population of county
    AdjRate   Adjusted mortality rate (per 100,000)
    SE        Estimated standard error (per 100,000)

We want to identify counties that have mortality rates that are significantly more than 18 per 100,000 population, in order to devote resources to them. For this problem assume that the adjusted rates are unbiased and normally distributed, although the latter clearly is not the case for smaller counties. Limit your analysis to those counties that have a population of at least 20,000. Use a multiple testing method to identify those counties that should receive resources according to the criterion stated above, and explain why you chose it over alternative methods.

Problem 4. The data frame MTVR found in the file MTVR.RData gives information on n = 112 USMC Medium Terrain Vehicle Replacements (MTVRs) at Camp Lejeune, NC. The data are taken from internal Caterpillar engine diagnostic readings. Variables are as follows:

    PTO           Percent of time in Power Takeoff (PTO) mode
    Idle          Percent of time in idle mode
    Miles         Number of miles driven
    Load.factor   Percent of max. available power used by the engine
    MPG           Fuel efficiency (miles per gallon)

Source: Penn State Applied Research Laboratory, 2013

(a) Use the pairs() command to identify the most obvious outlier: describe what it is and whether you believe it would be justified to delete this observation.

(b) Fit a least-squares regression model to predict MPG from the other variables both with and without the outlier included. Does the outlier have a large effect on the fitted model?

(c) Produce a 95% lower confidence bound (LCB) for the true regression coefficient on Load.factor (i) before removing the outlier and (ii) after removing the outlier.
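
For Problem 4(c), a one-sided 95% lower confidence bound for a regression coefficient can be computed as the estimate minus the 0.95 quantile of the t distribution (on the residual degrees of freedom) times the standard error. The R code below is only a minimal sketch under the usual normal-error linear model. The data frame sim is simulated stand-in data, not the MTVR data, and lcb() is a hypothetical helper name introduced for illustration.

```r
# Simulated stand-in for MTVR (ASSUMPTION: the real values are not shown here,
# so these numbers are made up purely to make the example runnable).
set.seed(1)
n <- 112
sim <- data.frame(
  PTO         = runif(n, 0, 20),
  Idle        = runif(n, 20, 70),
  Miles       = runif(n, 500, 20000),
  Load.factor = runif(n, 10, 40)
)
sim$MPG <- 6 - 0.02 * sim$Idle - 0.03 * sim$Load.factor + rnorm(n, sd = 0.3)

# Least-squares fit of MPG on the other four variables
fit <- lm(MPG ~ PTO + Idle + Miles + Load.factor, data = sim)

# Hypothetical helper: one-sided lower confidence bound for one coefficient,
# computed as estimate - t_{level, residual df} * standard error
lcb <- function(fit, term, level = 0.95) {
  tab <- coef(summary(fit))
  est <- tab[term, "Estimate"]
  se  <- tab[term, "Std. Error"]
  est - qt(level, df.residual(fit)) * se
}

lcb_all <- lcb(fit, "Load.factor")

# Cross-check: a one-sided 95% bound equals the lower limit of a
# two-sided 90% interval from confint()
lcb_check <- confint(fit, "Load.factor", level = 0.90)[1]

print(c(lcb = lcb_all, confint_lower = lcb_check))
```

With the real data, the same call would be repeated on a fit that excludes the outlier row identified in part (a), for example lm(..., data = MTVR[-i, ]) where i is that row's index.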
Nov 19, 2021