STAT 225 SPRING 2019 --- Homework 5

DUE at the beginning of class on Wednesday, April 24, 2019. FOR ONLINE STUDENTS: must turn in via Blackboard by 11:59 pm on April 24, 2019. Or turn in to the course coordinator's office by 5:00 pm.

SHOW WORK!!!!! Full credit will not be given for answers only.

NOTE: for any question asking you to determine a probability, you MUST write out a probability statement using proper notation! Probabilities should be in DECIMAL form and rounded to 4 decimal places.

1. Consumer Reports magazine presented the following data on the number of calories in a hot dog for each of 18 brands of meat hot dogs:

   171 191 182 193 172 147 146 139 174
   134 178 153 107 195 135 140 138 155

   a) Produce a stem and leaf plot of the data.
   b) Determine the five-number summary of the above data.
   c) What is the average number of calories contained in a meat hot dog across all 18 brands?
   d) What are the variance and standard deviation of the number of calories a hot dog contains?
   e) Is the data set symmetric, skewed right, or skewed left? Apart from looking at the stem and leaf plot, make sure to explain how the numeric results you obtained in b) and c) can be used to support your claim.
   f) What is the Inter-Quartile Range of the data?
   g) Are there any outliers in the data set? Support your claim with calculations.
   h) Produce a modified box plot of the data.
   i) What is the 40th percentile?

2. Using the graph below and the labels given, answer the following questions.

   a) What does the point labeled F represent on the boxplot?
   b) What does the point labeled A represent on the boxplot?
   c) How would you calculate the range for this dataset? (Use the appropriate letters and give the approximate value.)
   d) Which letter represents the first quartile? Approximately, what is that value?
   e) How would you calculate the IQR? (Use the appropriate letters.)
   f) Between what two letters does the 67th percentile fall?
   g) What letter represents the median? What is the approximate value for the median?
   h) Would you expect the mean to be less than the median, greater than the median, or about the same value as the median?
   i) What is the approximate value of Q3?

3. Using Figures 1 and 2 above, answer the following.

   a) State whether each data set is symmetric, skewed right, or skewed left.
   b) For each figure, identify each letter a, b, and c as the mean, median, mode, range, or standard deviation. (Choose only one of these per letter; not all are used. Note: the locations of a, b, and c are approximations.)

4. The Fast and the Furious franchise has a series of films focused on illegal street racing and heists. The plot below reveals how lucrative this franchise has become. The dependent variable is worldwide gross in millions of US dollars and the independent variable is budget in millions of US dollars. The calculated regression line is $\widehat{E(y)} = -185.595 + 6.557x$. The corresponding coefficient of determination is 0.8155.

   a) What are the form and direction of the relationship between budget and gross?
   b) What is the correlation coefficient between gross and budget? Is the correlation strong, moderate, or weak?
   c) What does the 6.557 in the regression equation represent? What does this mean in terms of the story?
   d) After seeing the financial success of The Avengers: Infinity War with a 400 million dollar budget, film executives would like to predict how much the Fast and Furious franchise will make with a 350 million dollar budget. Is this prediction valid? Why or why not?
   e) One film had a 125 million dollar budget and a gross of 626 million dollars. What is its residual?
   f) What percent of the variation in gross is explained by the linear relationship with the budget?

   [Figure: scatterplot titled "Worldwide Gross (US Dollars in Millions) vs. Budget in US Dollars in Millions for Fast and Furious Franchise". Horizontal axis: Budget in US Dollars in Millions (0 to 300). Vertical axis: Worldwide Gross in Millions of US Dollars (0 to 1600).]

5. Given each scenario, do the following:
   (1) identify the appropriate graph you would use to assist you in finding out the information;
   (2) state whether the data are qualitative or quantitative;
   (3) if qualitative, identify as ordinal or nominal; if quantitative, identify as ratio or interval.

   SCENARIOS:
   a) A poll of 200 undergraduates was taken. The respondents were asked to list all the modes of transportation used going from their home to campus.
   b) Is the correlation between daily outside temperature and amount of natural gas used for heating positive or negative?
   c) We want to understand the Indiana yearly harvest (in tons) of soybeans between 1985 and 2015.
   d) We want to see if there is a relationship between the weight of a passenger vehicle and fuel efficiency (measured as miles per gallon).
   e) We want to understand the distribution of yearly household incomes in Tippecanoe County for 2012.
   f) We want to understand the distribution of the cholesterol level of 100 males aged 20-30.
   g) What is the percentage of Indiana vehicles that are small passenger cars, large passenger cars, trucks, SUVs, and other types?

6. The grad students are going to celebrate Sophie's successful PhD defense, so she did a survey among her friends to find out their preferences as to where to go. Each participant in the survey is either a current graduate student or works in industry and chooses only one location. The results are in the table below.

                         Chumley's   Black Sparrow   Harry's
   Graduate Student         16            11            15
   Works in Industry        18            13            10

   a) What is the probability that the participant was a graduate student and wanted to go to the Black Sparrow? Is this a marginal, conditional, or joint probability?
   b) What is the probability that a participant wanted to go to Harry's? Is this a marginal, conditional, or joint probability?
   c) Knowing a participant was a graduate student, what is the probability he/she wanted to go to Chumley's? Is this a marginal, conditional, or joint probability?
   d) Please complete a table of expected counts.
   e) Please complete a table of partial $\chi^2$ values.
   f) What is the value of the $\chi^2$ statistic? What are its degrees of freedom?
   g) State the null and alternative hypotheses of the Chi-Square ($\chi^2$) test to determine if there is a relationship between participant work status and location preference.
   h) Using $\alpha = 0.05$, state the conclusion of the Chi-Square ($\chi^2$) test in the context of this problem. State your reasoning behind your conclusion.

7. Bacteria can develop resistance to antibiotics. To understand what can be treated, a scientist has collected data on the drug amounts that inhibit the growth of a certain bacteria. It is known that the amount of drug needed, $X$, has a Normal distribution with a mean of 5 mg and a standard deviation of 1.1 mg.

   a) What is the probability $X$ is greater than 5.5 mg?
   b) Given that the amount of drug is between 5 mg and 6 mg, what is the probability that it is less than 5.5 mg?
   c) For the purpose of monitoring resistance, the scientist wants to know the 99th percentile of $X$. Find it.
   d) Without using the Z-table, find the 16th percentile. (Use the EMPIRICAL RULES on this.)
   e) Without using the Z-table, what is the probability $X$ is between 2.8 and 6.1? (Use the EMPIRICAL RULES on this.)
   f) The scientist realizes there is measurement error in her experiment, referred to as $\varepsilon$. The measurement error is independent of $X$. Through her collaborators, the scientist learns that $\varepsilon \sim \text{Normal}(\mu = 0, \sigma = 0.5)$. What are the distribution and parameters of $X + \varepsilon$?

8. A recent survey asked players of the popular game Fortnite their age (in years) and how many hours they play in a week. The following data were collected.

   Age   Hours
   36      2
   17     18
   29     12
   30      5
   23     15

   a) Calculate the sample mean of hours.
   b) Find the sample standard deviation of age.
   c) Calculate the covariance of age and hours.
   d) Calculate the correlation of age and hours. Also, what does this correlation indicate about the strength of the linear relationship between age of player and hours played?
   e) What is the $r^2$ for age and hours?

9. For the following questions a-f, choose the correct graph. Each answer may be used once, more than once, or not at all.

   a) Which graph clearly shows you the 75th percentile?
   b) Which graph should only be used for qualitative data?
   c) Which graph shows you data in which the mean is greater than the median?
   d) Which graph shows you a distribution that is skewed left?
   e) Which graph shows data that was collected using a survey with a biased question?
   f) Which graph shows you a distribution that is skewed right?

TEXTBOOK PROBLEMS

NOTE: copies of the textbook are on reserve in the MATH library (3rd floor of the MATH building) and in the STAT help room (HAAS 115).

10. Exercise 35.14 (movie length), page 467 (Assume this is normally distributed.)
11. Exercise 36.1 (Haircuts), page 479
12. Exercise 37.22 (Free throws), page 503 (Assume each shot is independent.)
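For Problem 6 parts d) through h), the arithmetic behind the expected counts, the partial chi-square values, and the test statistic can be checked with a short script. The following is only a sketch in plain Python, not part of the assignment: the function names (`expected_counts`, `partial_chi_square`, `chi_square_statistic`) are made up for illustration, and the closed-form p-value is an assumption that holds only because this table has 2 degrees of freedom. The course may expect a table lookup instead.

```python
import math

# Observed counts from the Problem 6 table.
# Rows: work status; columns: Chumley's, Black Sparrow, Harry's.
observed = [
    [16, 11, 15],  # Graduate Student
    [18, 13, 10],  # Works in Industry
]


def expected_counts(table):
    """Expected count per cell under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def partial_chi_square(table):
    """Per-cell contribution (observed - expected)^2 / expected."""
    expected = expected_counts(table)
    return [
        [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
        for obs_row, exp_row in zip(table, expected)
    ]


def chi_square_statistic(table):
    """Return (statistic, degrees of freedom) for a two-way table."""
    statistic = sum(sum(row) for row in partial_chi_square(table))
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


statistic, df = chi_square_statistic(observed)

# Assumption: for df == 2 only, the chi-square upper-tail probability has the
# closed form exp(-x / 2). For any other df, use a table or a statistics library.
p_value = math.exp(-statistic / 2) if df == 2 else None

print("expected counts:", expected_counts(observed))
print("partial chi-square values:", partial_chi_square(observed))
print("statistic:", statistic, "df:", df, "p-value:", p_value)
```

Comparing `p_value` with 0.05 (or the statistic with the critical value for 2 degrees of freedom) gives the decision for part h). The written conclusion still has to be stated in the context of the problem.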
Apr 16, 2021