Exercise 1: Arithmetic of OLS

The point of this exercise is to demonstrate some of the arithmetic properties of residuals, fitted values, and the R² statistic discussed in Section 2.3.

• Run the regression of weight on height. Interpret the slope and intercept.
• Create the residuals and confirm they have mean zero (see Eq. 2.14).
• Create the fitted values (Yhats), and confirm that the average of Yhat is the same as the average of Y.
• Confirm that the residuals are uncorrelated with X and with the fitted values.
• Confirm that R² is the ratio of the variance of the fitted values to Var(Y).
• Confirm that R² is the square of the correlation between the fitted values and Y.
• Calculate SSE, SSR, and SST and confirm that these are the same as what Stata reports in the upper left as “Model”, “Residual” and “Total”. Recall that SSE = (n-1)*Var(Yhat).

Exercise 2: Conditional Means and the Standard Error of Regression (SER)

The point of this exercise is to demonstrate that regression is about calculating conditional expectations; you will also calculate the standard error of the regression (SER) and the standard error of β̂₁ using the formula from Section 2.5.

• Run a regression of height on female.
• Use summarize to find the average height for men and confirm that it equals the intercept.
• Use summarize to find the average height for women and confirm that the regression coefficient on female is the difference between the female and male average heights (female minus male).
• Generate the residuals and apply Eq. 2.61 to find the standard error of the regression (SER). Confirm this comes out the same as what Stata reports as the Root Mean Square Error (Root MSE).
• Use this result to find the standard error of the slope estimate (the coefficient on female) and confirm that this is what Stata reports. Recall that SSTx = (n-1)*Var(x).

Exercise 3: Units of Measurement

This exercise comes from Section 2.4, and looks at the effect of changing units of measurement.

• Download and open the dataset CEOSAL1.DTA.
• Study Example 2.3, and recreate the regression results. State carefully what the slope coefficient means, specifying the units of measurement.
• Now follow the steps described for equation 2.40. Generate a variable that measures salary in dollars instead of 1000s of dollars (so multiply it by 1000). Confirm that if you use this new variable as the y-variable, the estimated slope and intercept both increase by a factor of 1000. State carefully what the slope coefficient means, specifying the units of measurement. Does this transformation change the interpretation of the regression results? Does R² change?
• Now generate a variable that measures roe in decimals instead of percentages (so divide it by 100). Confirm that if you use this new variable as the x-variable, the intercept does not change, but the slope increases by a factor of 100. State carefully what the slope coefficient means, specifying the units of measurement. Does this transformation change the interpretation of the regression results? Does R² change?

Exercise 4: The Log-Level Functional Form

This exercise also comes from Section 2.4, and looks at the effect of changing the functional form of a regression, using logs for the y-variable only. A linear relationship between education and log(wage) is a nonlinear relationship between education and wage (see Figure 2.6).

• Download the dataset WAGE1.DTA, and recreate the regression results from Example 2.4.
• Generate a variable that measures the hourly wage in logarithms, and recreate the results from Example 2.10.

These two regressions have fundamentally different interpretations: in the first case we impose an assumption that each year of schooling raises wages by a constant dollar amount; in the second case we assume each year of schooling raises wages by a constant percentage amount. The second assumption makes more sense: a year of grad school should be worth more in dollar terms than a year of elementary school.
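The exercises are meant to be worked in Stata on the course datasets. As an independent cross-check, the identities in Exercise 1 and the rescaling results in Exercise 3 can be sketched in plain Python. This is only an illustration under stated assumptions: the data are simulated (not the course data), the helper names ols, var, corr and close are made up for this sketch, and the tolerances are arbitrary.

```python
import math
import random
from statistics import fmean


def ols(x, y):
    """Simple OLS of y on x with an intercept; returns (intercept, slope)."""
    xbar, ybar = fmean(x), fmean(y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def var(v):
    """Sample variance with the n-1 divisor."""
    m = fmean(v)
    return sum((vi - m) ** 2 for vi in v) / (len(v) - 1)


def corr(u, v):
    """Sample correlation coefficient."""
    mu, mv = fmean(u), fmean(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def close(a, b):
    """Equality up to floating-point rounding (tolerances are arbitrary)."""
    return math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-9)


# Simulated stand-in data (an assumption; NOT the course dataset).
rng = random.Random(0)
n = 200
x = [rng.gauss(67, 4) for _ in range(n)]
y = [-100 + 4 * xi + rng.gauss(0, 20) for xi in x]

b0, b1 = ols(x, y)
fitted = [b0 + b1 * xi for xi in x]
resid = [yi - fi for yi, fi in zip(y, fitted)]

ybar = fmean(y)
sse = sum((fi - ybar) ** 2 for fi in fitted)  # explained, Stata's "Model"
ssr = sum(e ** 2 for e in resid)              # residual, Stata's "Residual"
sst = sum((yi - ybar) ** 2 for yi in y)       # total, Stata's "Total"
r2 = sse / sst

# Exercise 3 analogue: multiply y by 1000, then divide x by 100.
y_big = [1000 * yi for yi in y]
x_dec = [xi / 100 for xi in x]
b0_y, b1_y = ols(x, y_big)
b0_x, b1_x = ols(x_dec, y)

checks = {
    "residuals have mean zero": close(fmean(resid), 0.0),
    "mean of fitted values equals mean of y": close(fmean(fitted), ybar),
    "residuals uncorrelated with x": close(corr(resid, x), 0.0),
    "residuals uncorrelated with fitted values": close(corr(resid, fitted), 0.0),
    "R2 = Var(fitted) / Var(y)": close(r2, var(fitted) / var(y)),
    "R2 = corr(fitted, y)^2": close(r2, corr(fitted, y) ** 2),
    "SSE + SSR = SST": close(sse + ssr, sst),
    "SSE = (n-1) * Var(fitted)": close(sse, (n - 1) * var(fitted)),
    "y * 1000 scales intercept by 1000": close(b0_y, 1000 * b0),
    "y * 1000 scales slope by 1000": close(b1_y, 1000 * b1),
    "y * 1000 leaves R2 unchanged": close(corr(x, y_big) ** 2, r2),
    "x / 100 leaves intercept unchanged": close(b0_x, b0),
    "x / 100 scales slope by 100": close(b1_x, 100 * b1),
    "x / 100 leaves R2 unchanged": close(corr(x_dec, y) ** 2, r2),
}
for name, ok in checks.items():
    print(f"{name}: {ok}")
```

Each check compares two quantities that should agree up to floating-point rounding. The same comparisons can be made in Stata by generating the fitted values and residuals after the regression and then inspecting their means, variances and correlations.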