Q1 [70 pts]
Star Field is home to the Lions professional baseball team. The team’s new marketing director, Janna Kay, wants to better understand the key drivers of attendance at the stadium in order to increase ticket revenues, optimize concession inventories and staffing, and schedule promotional giveaways. Using historical game-level data, we consider model 1:
attendance = b0 + b1 nightgame + b2 temp_f + b3 sunday + b4 saturday + b5 friday + b6 promo + b7 openingday + b8 school + u,
where attendance is the total attendance at the game, temp_f is the temperature on the game day, nightgame is a dummy variable indicating whether the game is played at night, friday, saturday and sunday are dummies for the day of the week, and promo, opening_day and school are dummies indicating, respectively, whether there is a promotional activity, whether it is opening day, and whether the local public school system is in session.
1. Import the baseball data first. Estimate model 1 and summarize the result using stargazer. According to the estimation results, which day of the week usually has the highest attendance? Why?
2. By how much is attendance expected to change if the temperature on the game day is 10 degrees higher, holding the other regressors fixed?
3. What is the estimated difference in average attendance between night games and daytime games?
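The mechanics behind questions 1 to 3 can be sketched in R on made-up data. This is only an illustration under stated assumptions: the data frame `games`, every number used to simulate it, and the reduced regressor list are hypothetical, and the real baseball data will give different estimates. The stargazer call is guarded so the sketch still runs if the package is not installed.

```r
# Simulated stand-in for the baseball data; all numbers below are made up.
set.seed(1)
n <- 200
day <- sample(c("mon", "tue", "wed", "thu", "friday", "saturday", "sunday"),
              n, replace = TRUE)
games <- data.frame(
  nightgame = rbinom(n, 1, 0.6),
  temp_f    = round(runif(n, 45, 95)),
  promo     = rbinom(n, 1, 0.3),
  friday    = as.integer(day == "friday"),
  saturday  = as.integer(day == "saturday"),
  sunday    = as.integer(day == "sunday")
)
games$attendance <- 20000 + 1500 * games$nightgame + 80 * games$temp_f +
  2000 * games$friday + 4000 * games$saturday + 3000 * games$sunday +
  2500 * games$promo + rnorm(n, sd = 2000)

# Linear model with dummy regressors (a subset of model 1's regressors).
fit <- lm(attendance ~ nightgame + temp_f + sunday + saturday + friday + promo,
          data = games)
b <- coef(fit)

# Expected change for a 10-degree increase, other regressors held fixed.
effect_10_degrees <- 10 * b[["temp_f"]]

# The night-minus-day difference is the nightgame coefficient itself.
night_minus_day <- b[["nightgame"]]

# Day dummies are measured relative to the omitted days (Monday to Thursday).
day_effects <- b[c("friday", "saturday", "sunday")]

print(summary(fit))

# Optional formatted table, only if stargazer is available.
if (requireNamespace("stargazer", quietly = TRUE)) {
  stargazer::stargazer(fit, type = "text")
}
```

Because Monday through Thursday are the omitted category, each day coefficient is a difference from those weekdays, so the day dummies should be compared both with each other and with zero.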
*Suppose we believe that attendance does not always increase as the temperature increases, so we add the squared temperature term, as in model 2:
attendance=b0+b1 nightgame+b2 temp_f+b3 sunday+b4 saturday+b5 friday+b6 promo+b7 openingday+b8 school+b9 temp_f^2+u
4. Estimate model 2. Does the result support our conjecture or not?
5. Using the estimated coefficients of temp_f and temp_f^2, briefly explain how average attendance changes as the temperature increases.
6. According to the results, do you want to keep temp_f^2 in the model or not? Why? (Hint: what is the relevant hypothesis test here?)
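For reference, the algebra behind questions 4 to 6 can be written out as follows. This is a sketch that uses the coefficient labels of model 2; the symbol temp* for the turning point is introduced here only for illustration.

```latex
% Marginal effect of temperature in model 2 (other regressors held fixed)
\[
\frac{\partial\, \text{attendance}}{\partial\, \text{temp\_f}}
  = b_2 + 2\, b_9\, \text{temp\_f}
\]
% If b_2 > 0 and b_9 < 0, attendance rises with temperature up to
\[
\text{temp}^{*} = -\frac{b_2}{2\, b_9}
\]
% and falls beyond it. Whether the squared term belongs in the model
% is a t test of
\[
H_0 : b_9 = 0 \quad \text{against} \quad H_1 : b_9 \neq 0 .
\]
```

The marginal effect now depends on the temperature level, so the coefficient of temp_f alone no longer describes the effect of a one-degree change.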
*Lastly, we consider model 3, which uses the logarithm of attendance as the dependent variable. The set of regressors is the same as in model 2.
7. Estimate model 3 and summarize models 1, 2, and 3 using stargazer. Interpret the estimated coefficient of opening_day.
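As a reminder of how a dummy coefficient reads when the dependent variable is in logs, the standard log-level algebra is sketched below, with b7 denoting the opening-day coefficient as in the models above.

```latex
% Model 3: log(attendance) = b_0 + ... + b_7\,\text{openingday} + ... + u
% Comparing opening day with other games, all else equal:
\[
\log(\text{attendance}_{\text{opening}}) - \log(\text{attendance}_{\text{other}}) = b_7
\]
% Exact percentage difference in attendance:
\[
\%\Delta\, \text{attendance} = 100 \left( e^{b_7} - 1 \right)
\]
% For small b_7 this is approximately 100\, b_7 percent.
```

The approximation 100 b7 is convenient, but it becomes inaccurate as the coefficient moves away from zero, in which case the exact formula is the safer one to report.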
Q2 [30 pts]
1. Download Amazon and Google’s daily stock price data from XXXXXXXXXX to XXXXXXXXXX. You can do so on Yahoo Finance (https://finance.yahoo.com/quote/AMZN/history?p=AMZN) or other websites. Use the closing price.
2. Import the data you downloaded into R. Plot the time series of Amazon and Google stock prices using different colors.
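A minimal base R sketch of the plotting step is given below. It uses simulated prices so that it runs on its own: the dates, price levels, and column names `amzn_close` and `goog_close` are placeholders, not the real data.

```r
# Simulated placeholder series; the dates and prices are made up.
set.seed(42)
n_days <- 100
prices <- data.frame(
  Date       = seq(as.Date("2000-01-03"), by = "day", length.out = n_days),
  amzn_close = 100 + cumsum(rnorm(n_days)),
  goog_close = 120 + cumsum(rnorm(n_days))
)

# Common y-axis range so that both series fit in the same panel.
y_range <- range(prices$amzn_close, prices$goog_close)

# Draw the first series, then overlay the second in a different color.
plot(prices$Date, prices$amzn_close, type = "l", col = "blue",
     ylim = y_range, xlab = "Date", ylab = "Close price",
     main = "Amazon and Google daily close prices (simulated)")
lines(prices$Date, prices$goog_close, col = "red")
legend("topleft", legend = c("Amazon", "Google"),
       col = c("blue", "red"), lty = 1)
```

With the downloaded files, the simulated data frame would be replaced by something like `read.csv()` on each file followed by a merge on the date column. The file names and column names depend on the download source, so they are left unspecified here.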