DASC 512, Homework 5 Instructions: Each problem will require you to perform a simple regression analysis. A regression analysis includes the following information/steps: • Hypothesized Model: State...

Looking for a Word document and a .py file for any Python charts. Please use Python for any plotting, charts, etc.


DASC 512, Homework 5

Instructions: Each problem will require you to perform a simple regression analysis. A regression analysis includes the following information/steps:

• Hypothesized Model: State your hypothesized model as ŷ = β0 + β1x with numerical values for β0 and β1
• A scatterplot showing your hypothesized model overlaying the original data.
• Parameter estimates and confidence intervals where requested (unless otherwise noted, use the t-distribution for tests/intervals)
• A test for validity in your model (either slope or coefficient of correlation), remember all portions of a hypothesis test
  – Null and Alternative Hypothesis
  – Test Statistic
  – Either p-value or a critical value to compare
  – Correctly stated conclusion
• Predictions and estimates (where requested) for your model.
• For now, you do not need to worry about validating assumptions.

1. Consider the following hospital data. We would like to determine if the average length of stay can predict the average hospital charge and use it to make a prediction for the charge when a patient stays 4 days. Perform a regression analysis to determine the answer to this question, use α = 0.10 when required.

State           Average Charge   Average Length of Stay (days)
Massachusetts   11680            3.64
New Jersey      11630            4.2
Pennsylvania    9850             3.84
Minnesota       9950             3.11
Indiana         8490             3.86
Michigan        9020             3.54
Florida         13820            4.08
Georgia         8440             3.57
Tennessee       8790             3.8
Texas           10400            3.52
Arizona         12860            3.77
California      16740            3.78

2. Using the ‘hofbatting.csv’ file from Homework 1, conduct a regression analysis to determine if OBP can be used to predict SLG.
(a) Show that this model is useful.
(b) What is the expected slugging percentage for a player if they have an On-Base percentage of 0.40? Give an α = 0.05 confidence interval.
(c) What would you expect the slugging percentage of a new inductee with an OBP of 0.40 to be? Give an α = 0.05 confidence interval.
(d) We previously identified Willard Brown as an outlier. Redo the analysis from the previous problems excluding his data. Did it make a difference?

3. In baseball, it is hypothesized that we can use the run differential to predict the number of wins a team will have by the end of the season. Use the file ‘TeamData.csv’ to test this concept.
(a) Create a column of data for Run Differential (R − RA) and a column for Win Percentage (W/(W + L)). Use these values to determine if the Run Differential can be used to predict the percentage of wins a team will end up with.
(b) Create a plot showing the confidence intervals for mean estimation and prediction overlaid on the original scatterplot (reduce the dot size so that I can see the lines/zones).
(c) Bill James, the godfather of sabermetrics, empirically derived a non-linear formula to estimate winning percentage called the Pythagorean Expectation:
    Wpct = R^2 / (R^2 + RA^2)
Create a new variable representing R^2 / (R^2 + RA^2), the Pythagorean model. Now use this new column to replace the Run Differential and re-run your analysis.
(d) The 2001 Seattle Mariners had 116 wins and 46 losses with a +300 Run Differential (in the data). Create a CI for this outcome for each of the models.
(e) Which of these models is better and why?
Rk,,Inducted,Yrs,From,To,ASG,WAR/pos,G,PA,AB,R,H,2B,3B,HR,RBI,SB,CS,BB,SO,BA,OBP,SLG,OPS
1,Hank Aaron HOF,1982,23,1954,1976,25,137.3,3298,13941,12364,2174,3771,624,98,755,2297,240,73,1402,1383,0.305,0.374,0.555,0.928
3,Roberto Alomar HOF,2011,17,1988,2004,12,62.9,2379,10400,9073,1508,2724,504,80,210,1134,474,114,1032,1140,0.3,0.371,0.443,0.814
6,Cap Anson HOF,1939,27,1871,1897,0,91.1,2524,11331,10281,1999,3435,582,142,97,2075,277,16,984,330,0.334,0.394,0.447,0.841
7,Luis Aparicio HOF,1984,18,1956,1973,13,51.7,2599,11230,10230,1335,2677,394,92,83,791,506,136,736,742,0.262,0.311,0.343,0.653
8,Luke Appling HOF,1964,20,1930,1950,7,69.9,2422,10254,8856,1319,2749,440,102,45,1116,179,108,1302,528,0.31,0.399,0.398,0.798
9,Richie Ashburn HOF,1995,15,1948,1962,6,60.2,2189,9736,8365,1322,2574,317,109,29,586,234,92,1198,571,0.308,0.396,0.382,0.778
10,Earl Averill HOF,1975,13,1929,1941,6,45.1,1669,7221,6353,1224,2019,401,128,238,1164,70,58,774,518,0.318,0.395,0.534,0.928
11,Home Run Baker HOF,1955,13,1908,1922,0,59.5,1575,6663,5984,887,1838,315,103,96,987,235,54,473,346,0.307,0.363,0.442,0.805
12,Dave Bancroft HOF,1971,16,1915,1930,0,46.5,1913,8248,7182,1048,2004,320,77,32,591,145,75,827,487,0.279,0.355,0.358,0.714
13,Ernie Banks HOF,1977,19,1953,1971,14,62.5,2528,10394,9421,1305,2583,407,90,512,1636,50,53,763,1236,0.274,0.33,0.5,0.83
14,Jake Beckley HOF,1971,20,1888,1907,0,57.1,2389,10504,9538,1602,2934,473,244,87,1578,315,,616,524,0.308,0.361,0.436,0.797
15,Johnny Bench HOF,1989,17,1967,1983,14,72.3,2158,8674,7658,1091,2048,381,24,389,1376,68,43,891,1278,0.267,0.342,0.476,0.817
17,Yogi Berra HOF,1972,19,1946,1965,18,56.1,2120,8359,7555,1175,2150,321,49,358,1430,30,26,704,414,0.285,0.348,0.482,0.83
19,Wade Boggs HOF,2005,18,1982,1999,12,88.3,2440,10740,9180,1513,3010,578,61,118,1014,24,35,1412,745,0.328,0.415,0.443,0.858
20,Jim Bottomley HOF,1974,16,1922,1937,0,32.8,1991,8354,7471,1177,2313,465,151,219,1422,58,15,664,591,0.31,0.369,0.5,0.869
21,Lou Boudreau HOF,1970,15,1938,1952,8,59.1,1646,7024,6029,861,1779,385,66,68,789,51,50,796,309,0.295,0.38,0.415,0.795
22,Roger Bresnahan HOF,1945,17,1897,1915,0,39.1,1446,5374,4481,682,1252,218,71,26,530,212,4,714,405,0.279,0.386,0.377,0.764
23,George Brett HOF,1999,21,1973,1993,13,84,2707,11625,10349,1583,3154,665,137,317,1596,201,97,1096,908,0.305,0.369,0.487,0.857
24,Lou Brock HOF,1985,19,1961,1979,6,42.8,2616,11240,10332,1610,3023,486,141,149,900,938,307,761,1730,0.293,0.343,0.41,0.753
25,Dan Brouthers HOF,1945,19,1879,1904,0,76.9,1673,7676,6711,1523,2296,460,205,106,1296,256,,840,238,0.342,0.423,0.519,0.942
27,Willard Brown HOF,2006,1,1947,1947,0,-0.7,21,67,67,4,12,3,0,1,6,2,2,0,7,0.179,0.179,0.269,0.448
29,Jesse Burkett HOF,1946,16,1890,1905,0,60.5,2067,9620,8426,1720,2850,320,182,75,952,389,,1029,613,0.338,0.415,0.446,0.861
30,Roy Campanella HOF,1969,10,1948,1957,8,31.6,1215,4815,4205,627,1161,178,18,242,856,25,15,533,501,0.276,0.36,0.5,0.86
31,Rod Carew HOF,1991,19,1967,1985,18,76.6,2469,10550,9315,1424,3053,445,112,92,1015,353,187,1018,1028,0.328,0.393,0.429,0.822
32,Max Carey HOF,1961,20,1910,1929,0,51.1,2476,10770,9363,1545,2665,419,159,70,800,738,109,1040,695,0.285,0.361,0.386,0.747
34,Gary Carter HOF,2003,19,1974,1992,11,66.4,2296,9019,7971,1025,2092,371,31,324,1225,39,42,848,997,0.262,0.335,0.439,0.773
35,Orlando Cepeda HOF,1999,17,1958,1974,11,46.1,2124,8698,7927,1131,2351,417,27,379,1365,142,80,588,1169,0.297,0.35,0.499,0.849
36,Frank Chance HOF,1946,17,1898,1914,0,43.5,1288,5103,4299,798,1274,200,79,20,596,403,0,556,320,0.296,0.394,0.394,0.788
38,Fred Clarke HOF,1945,21,1894,1915,0,64.2,2246,9838,8584,1622,2678,361,220,67,1015,509,0,875,509,0.312,0.386,0.429,0.814
40,Roberto Clemente HOF,1973,18,1955,1972,15,89.8,2433,10211,9454,1416,3000,440,166,240,1305,83,46,621,1230,0.317,0.359,0.475,0.834
41,Ty Cobb HOF,1936,24,1905,1928,0,144.9,3034,13078,11434,2246,4189,724,295,117,1938,897,212,1249,680,0.366,0.433,0.512,0.945
42,Mickey Cochrane HOF,1947,13,1925,1937,2,48.9,1482,6207,5169,1041,1652,333,64,119,832,64,45,857,217,0.32,0.419,0.478,0.897
43,Eddie Collins HOF,1939,25,1906,1930,0,118.5,2826,12040,9949,1821,3315,438,187,47,1300,741,195,1499,468,0.333,0.424,0.429,0.853
44,Jimmy Collins HOF,1945,14,1895,1908,0,50.1,1725,7452,6795,1055,1999,352,116,65,983,194,,426,266,0.294,0.343,0.409,0.752
45,Earle Combs HOF,1970,12,1924,1935,0,40,1455,6513,5746,1186,1866,309,154,58,632,98,71,670,278,0.325,0.397,0.462,0.859
48,Roger Connor HOF,1976,18,1880,1897,0,80.6,1998,8847,7797,1620,2467,441,233,138,1323,244,,1002,455,0.316,0.397,0.486,0.883
50,Sam Crawford HOF,1957,19,1899,1917,0,69.9,2517,10594,9570,1391,2961,458,309,97,1525,367,43,760,579,0.309,0.362,0.452,0.814
51,Joe Cronin HOF,1956,20,1926,1945,7,61.9,2124,8840,7579,1233,2285,515,118,170,1424,87,71,1059,700,0.301,0.39,0.468,0.857
53,Kiki Cuyler HOF,1968,18,1921,1938,1,44.4,1879,8100,7161,1305,2299,394,157,128,1065,328,27,676,752,0.321,0.386,0.474,0.86
54,George Davis HOF,1998,20,1890,1909,0,80.2,2372,10178,9045,1545,2665,453,163,73,1440,619,,874,610,0.295,0.362,0.405,0.767
55,Andre Dawson HOF,2010,21,1976,1996,8,60.6,2627,10769,9927,1373,2774,503,98,438,1591,314,109,589,1509,0.279,0.323,0.482,0.806
57,Ed Delahanty HOF,1945,16,1888,1903,0,66.5,1837,8400,7511,1600,2597,522,186,101,1466,455,,741,439,0.346,0.411,0.505,0.916
58,Bill Dickey HOF,1954,17,1928,1946,11,52.4,1789,7064,6300,930,1969,343,72,202,1209,36,32,678
Answered 1 day after Aug 04, 2021

Answer To: DASC 512, Homework 5 Instructions: Each problem will require you to perform a simple regression...

Suraj answered on Aug 05 2021
146 Votes
DASC 512, Homework 5
Solution 1:
The first linear regression model predicts the average charge in a hospital from the average length of stay. It was fitted in Python using the OLS function, which is based on the ordinary least squares method, the standard way of fitting a linear regression.
The regression model was fitted without the intercept term. The fitted model is:
Average Charge = 2944.6631*Average length of stay
The reported R-squared value of the model is 0.959. For a fit without an intercept this is the uncentered R-squared, so it is not directly comparable to the R-squared of a model with an intercept and should not by itself be taken as evidence that this is the best model for prediction.
The predicted average charge for a 4-day stay is:
Average Charge = 2944.6631*4
= $11778.65
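The figures above can be checked with a short script. The following is a minimal sketch in plain Python, not the answerer's actual code (which is not shown). It reproduces the no-intercept fit and, for comparison, also fits the model with an intercept, ŷ = β0 + β1x, that the assignment asks for, together with the t statistic for the slope. The variable names are illustrative, and the critical value 1.812 is an assumption taken from a standard t table (two-sided α = 0.10, 10 degrees of freedom).

```python
import math

# Hospital data from Problem 1
charge = [11680, 11630, 9850, 9950, 8490, 9020,
          13820, 8440, 8790, 10400, 12860, 16740]
stay = [3.64, 4.2, 3.84, 3.11, 3.86, 3.54,
        4.08, 3.57, 3.8, 3.52, 3.77, 3.78]
n = len(stay)

# No-intercept fit (the model used in the answer): slope = sum(xy) / sum(x^2)
sum_xy = sum(x * y for x, y in zip(stay, charge))
sum_xx = sum(x * x for x in stay)
sum_yy = sum(y * y for y in charge)
slope_origin = sum_xy / sum_xx
sse_origin = sum((y - slope_origin * x) ** 2 for x, y in zip(stay, charge))
r2_uncentered = 1 - sse_origin / sum_yy  # uncentered R-squared

# Fit with intercept: y-hat = b0 + b1 * x
x_bar = sum(stay) / n
y_bar = sum(charge) / n
s_xx = sum((x - x_bar) ** 2 for x in stay)
s_yy = sum((y - y_bar) ** 2 for y in charge)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(stay, charge))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(stay, charge))
r2_centered = 1 - sse / s_yy  # usual (centered) R-squared

# Slope test, H0: beta1 = 0 vs Ha: beta1 != 0
se_b1 = math.sqrt(sse / (n - 2) / s_xx)
t_stat = b1 / se_b1
T_CRIT = 1.812  # assumed table value: two-sided alpha = 0.10, df = 10

# Predictions for a 4-day stay under each model
pred_origin = slope_origin * 4
pred_intercept = b0 + b1 * 4

print(f"no intercept:   slope={slope_origin:.4f}, R2(uncentered)={r2_uncentered:.3f}")
print(f"with intercept: b0={b0:.2f}, b1={b1:.2f}, R2={r2_centered:.3f}")
print(f"t={t_stat:.3f}, reject H0 at alpha=0.10: {abs(t_stat) > T_CRIT}")
print(f"4-day prediction: {pred_origin:.2f} (no intercept), {pred_intercept:.2f} (intercept)")
```

Comparing the two R-squared values and the slope test side by side gives a fuller picture of how much the length of stay explains than the uncentered R-squared alone.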
Solution 2:
The second model is based on the data set hofbatting.csv. Here, we want to check whether SLG can be predicted using OBP as the explanatory variable.
a)
The fitted regression model has an R-squared value of 0.987, which indicates a very good fit.
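As a rough illustration of how the OBP and SLG columns could be read and fitted, here is a minimal sketch in plain Python. It is not the answerer's code. The helper `fit_line` is a hypothetical name, and the embedded text is only the header plus the first five rows of the data listed in the question, so the numbers it prints are for demonstration and are not the results of the full analysis. With the real file, the `io.StringIO(...)` object would be replaced by `open("hofbatting.csv", newline="")`.

```python
import csv
import io

# Header plus the first five rows of the listing in the question (a small subset only)
SAMPLE = """Rk,,Inducted,Yrs,From,To,ASG,WAR/pos,G,PA,AB,R,H,2B,3B,HR,RBI,SB,CS,BB,SO,BA,OBP,SLG,OPS
1,Hank Aaron HOF,1982,23,1954,1976,25,137.3,3298,13941,12364,2174,3771,624,98,755,2297,240,73,1402,1383,0.305,0.374,0.555,0.928
3,Roberto Alomar HOF,2011,17,1988,2004,12,62.9,2379,10400,9073,1508,2724,504,80,210,1134,474,114,1032,1140,0.3,0.371,0.443,0.814
6,Cap Anson HOF,1939,27,1871,1897,0,91.1,2524,11331,10281,1999,3435,582,142,97,2075,277,16,984,330,0.334,0.394,0.447,0.841
7,Luis Aparicio HOF,1984,18,1956,1973,13,51.7,2599,11230,10230,1335,2677,394,92,83,791,506,136,736,742,0.262,0.311,0.343,0.653
8,Luke Appling HOF,1964,20,1930,1950,7,69.9,2422,10254,8856,1319,2749,440,102,45,1116,179,108,1302,528,0.31,0.399,0.398,0.798
"""


def fit_line(x, y):
    """Least-squares fit of y = b0 + b1 * x; returns (b0, b1, centered R-squared)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    r2 = s_xy ** 2 / (s_xx * s_yy)
    return b0, b1, r2


# Read the OBP and SLG columns by name
rows = list(csv.DictReader(io.StringIO(SAMPLE)))
obp = [float(r["OBP"]) for r in rows]
slg = [float(r["SLG"]) for r in rows]

b0, b1, r2 = fit_line(obp, slg)
print(f"SLG-hat = {b0:.4f} + {b1:.4f} * OBP  (R2 = {r2:.3f}, n = {len(rows)})")
print(f"Predicted SLG at OBP = 0.40: {b0 + b1 * 0.40:.3f}")
```

For part (d), the same helper could be called again after filtering out the Willard Brown row to see how much the fit changes.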
b)
To...