thanks


DEPARTMENT OF ECONOMICS
ECON 4041H – RESEARCH METHODOLOGY
Fall 2021, Peterborough
Assignment #3
Due date: November 17, 2021

Instructions: You must provide your own unique solution. You may work with others, but each of you is responsible for submitting your own problem set solution. Each question is 50 marks and each part is of equal value. Submit your solution through SafeAssign. Submission of one file generated using RMarkdown is best, but acceptable alternatives are allowed.

1. Use the dataset “klemA3.csv” to estimate the aggregate production function for an entire economy. This is a nice application of the basic economic theory of production. Start with a basic Cobb-Douglas production function

$Y = A K^{\alpha} L^{\beta}$

where Y is output (value-added), K is capital stock, and L is labour. We can estimate this Cobb-Douglas production function as a linear model by log-transforming it (footnote 1) into

$\log(Y) = \log(A) + \alpha \log(K) + \beta \log(L)$.

A more flexible functional form allows for interaction among factors, yielding:

$\log(Y) = \log(A) + \alpha_1 \log(K) + \alpha_2 (\log(K))^2 + \beta_1 \log(L) + \beta_2 (\log(L))^2 + \gamma (\log(K) \times \log(L))$.

The term $\log(A)$ represents a productivity parameter and is the coefficient on the constant in the regression, so relabel it as $\alpha_0$:

$\log(Y) = \alpha_0 + \alpha_1 \log(K) + \alpha_2 (\log(K))^2 + \beta_1 \log(L) + \beta_2 (\log(L))^2 + \gamma (\log(K) \times \log(L))$.

Note the equation has $(\log(L))^2$, not $\log(L^2)$.

The variables in the dataset are:
• ind: industry label
• indnum: an integer identifying an industry
• year: year
• y: value of gross output ($ millions)
• k: value of capital input ($ millions)
• l: value of labour input ($ millions)
• int: value of intermediate inputs ($ millions)

Footnote 1: log is the natural log (ln) in R.

a. Estimate the production function

$\log(Y) = \alpha_0 + \alpha_1 \log(K) + \beta_1 \log(L) + \alpha_2 (\log(K))^2 + \beta_2 (\log(L))^2 + \gamma (\log(K) \times \log(L)) + \varepsilon$

where Y is value-added, calculated as gross output minus intermediate inputs (y − int). Report your results, and comment briefly.

b. Is the Cobb-Douglas production function sufficient? Or is the full flexible functional form appropriate? Use a formal test (or tests) to support your conclusion.

c. Generate predicted value-added levels Y using emmeans() for
i. mean values of K and L.
ii. values of K and L equal to half their mean values.
iii. values of K and L equal to twice their mean values.

Remember from micro theory that for a function $y = a K^{\alpha} L^{\beta}$, if doubling both inputs yields
• double the output, the function displays constant returns to scale.
• less than double the output, the function displays decreasing returns to scale.
• more than double the output, the function displays increasing returns to scale.

Does this estimated production function display decreasing, constant, or increasing returns to scale?

Note: you specify the values of K and L, and emmeans() will apply the log() transformation. So provide values of K and L in the "at =" parameter, not values of log(K) or log(L). Also, add the option type = "response" as an additional parameter to the emmeans() command. That option converts the predicted means from log() values back into levels, essentially applying the exp() function to all output. To see that, try it without the option.

d. Estimate the marginal products of the two inputs. Since the marginal product of a factor of production is the partial derivative of output Y with respect to the factor ($\partial Y / \partial K$ and $\partial Y / \partial L$), you can use the margins() function for this.
i. Estimate the marginal effect of value-added (Y) with respect to capital, and graph the resulting estimates. Just like for emmeans() above, specify the values for K, not log(K), in the "at =" parameter. You will need to specify a vector of values of K to generate a vector of values of $\partial Y / \partial K$ to graph. Note that because the function takes log(K), your vector of values of K will have to be a geometric series (1, 2, 4, 8, . . . ; or 10, 100, 1000, . . . ) and not a linear series (1000, 2000, 3000, . . . ). Play around with this until you get it right.
ii. Repeat 1.d.i, but now with respect to labour L. The same details apply.
iii. What do the graphs reveal, and are they consistent with the economic theory of production?

2. Using the labour force survey file, “lfs21.rds”, explore whether wages differ for immigrants and non-immigrants. We will also explore interaction effects of immigrant status with other potential explanatory variables.

Some data processing is required.
• The variable immig has three categories. Two of the categories are for immigrants and identify time since they arrived in Canada. The third category represents non-immigrants. Recode the two immigrant categories into one, so that the new variable is a binary categorical variable identifying immigrant status only. In other words, the new variable will combine the two “Immigrant” status categories into one and will be coded “immigrant” and “non-immigrant”. Let’s identify this new variable as im2.
• The variable union has three categories: union member, not unionized but under a collective agreement, and non-unionized. Recode this variable into a new binary variable that combines the first two categories into one capturing the presence of a collective agreement. The variable will now be coded as either covered by a collective agreement or non-unionized. Let’s refer to it as ca2.
• Convert age_12 into a numeric variable, and drop the top age category “70 and over”. Let’s refer to it as age.

a. Run a regression with wages (hrlyearn) as the dependent variable, and use the following variables as explanatory variables: the numeric age in both linear and quadratic terms, education (educ), sex, sector of employment (cowmain), collective agreement status (ca2 from above), firmsize, immigrant status (im2 from above), and province (prov). You will have nine explanatory variables, including age as both linear and quadratic. Discuss the estimates and what the coefficients mean. Provide a complete discussion for all but province. We will address province next.

b. Do wages differ by province? Do they differ for every province, or are some similar? Answer this part using both lht() and emmeans().

c. Now interact the immigration status variable with all the categorical variables in the model except province, so use: educ, sex, cowmain, ca2, and firmsize. Very briefly characterize the interaction terms.

d. Explaining interactions is challenging, so now use emmeans() to calculate the interaction effect of immigration status on wages for each of the other five categorical explanatory variables with which im2 is interacted. Run the interaction for each separately, i.e. first run emmeans() for immigration status and education, then immigration status and sex, etc. Feel free to add any additional analysis that helps provide explanation. It is often useful to graph the results of emmeans() (hint, hint). You may also find the contrast() function helpful after running emmeans(). I leave this part a bit open-ended and invite you to explore these economic relationships using the tools we have been reviewing.
Answered 3 days after Nov 11, 2021


Mohd answered on Nov 15 2021
Reg
Bassi
11/14/2021
library(readr)
library(magrittr)
library(dplyr)
library(ggplot2)
library(rmarkdown)
library(MASS)
library(skimr)
library(ggeffects)
# emmeans is called below as emmeans::emmeans(), so it must be installed
klema3 <- read_csv("~/data/klema3.csv")
#View(klema3)
Descriptive stats
skim(klema3)
Data summary
    Name                     klema3
    Number of rows           4420
    Number of columns        7
    _______________________
    Column type frequency:
      character              1
      numeric                6
    ________________________
    Group variables          None

Variable type: character
    skim_variable  n_missing  complete_rate  min  max  empty  n_unique  whitespace
    ind                    0              1    5   66      0        65           0

Variable type: numeric
    skim_variable  n_missing  complete_rate       mean         sd       p0       p25       p50        p75       p100  hist
    indnum                 0              1      33.00      18.76     1.00     17.00     33.00      49.00       65.0  ▇▇▇▇▇
    year                   0              1    1980.50      19.63  1947.00   1963.75   1980.50    1997.25     2014.0  ▇▇▇▇▇
    y                      0              1  140818.40  257726.32   230.56  12308.21  45605.68  147478.41  2820728.0  ▇▁▁▁▁
    k                      0              1   33430.09  109737.20    20.90   2063.38   6984.37   24506.90  1939544.1  ▇▁▁▁▁
    l                      0              1   44697.50   90442.06    14.52   3194.63  13133.90   40102.09   790195.5  ▇▁▁▁▁
    int                    0              1   62690.81  103330.22    15.65   5241.25  20151.50   71683.25   788034.0  ▇▁▁▁▁
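Q1(a) and (b)
# Note: the assignment defines Y as value-added (y - int). log_y below is the log of
# gross output y, so the estimates that follow are for gross output, not value-added.
klema3$log_y<-log(klema3$y)
klema3$log_l<-log(klema3$l)
klema3$log_k<-log(klema3$k)
# Flexible (translog) form: linear, squared and interaction terms in log_k and log_l
k_mod<-lm(log_y~I(log_k)+I((log_k)^2)+I(log_l)+I((log_l)^2)+I(log_k*log_l),data=klema3)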
summary(k_mod)
##
## Call:
## lm(formula = log_y ~ I(log_k) + I((log_k)^2) + I(log_l) + I((log_l)^2) +
##     I(log_k * log_l), data = klema3)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -0.78160 -0.22818 -0.05519  0.18501  1.90069
##
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)
## (Intercept)       2.057833   0.108113   19.03   <2e-16 ***
## I(log_k)          0.402945   0.029025   13.88   <2e-16 ***
## I((log_k)^2)      0.058295   0.002982   19.55   <2e-16 ***
## I(log_l)          0.491054   0.027102   18.12   <2e-16 ***
## I((log_l)^2)      0.058983   0.003497   16.86   <2e-16 ***
## I(log_k * log_l) -0.113367   0.006044  -18.76   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3494 on 4414 degrees of freedom
## Multiple R-squared: 0.959, Adjusted R-squared: 0.959
## F-statistic: 2.067e+04 on 5 and 4414 DF, p-value: < 2.2e-16
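Part (b) asks for a formal test of Cobb-Douglas against the flexible form, which the output above does not include. One common way to do this is a nested-model F-test of the joint restriction that the two squared terms and the interaction are all zero. The sketch below is only an illustration on simulated data: the data frame `d`, the column `va`, and the coefficient values used to simulate it are hypothetical stand-ins, not the klemA3 data.

```r
# Simulated stand-in data (hypothetical); the real klemA3 file is not used here.
set.seed(1)
n  <- 500
k  <- exp(rnorm(n, mean = 8, sd = 1.5))
l  <- exp(rnorm(n, mean = 9, sd = 1.5))
va <- exp(1 + 0.4 * log(k) + 0.6 * log(l) + rnorm(n, sd = 0.3))
d  <- data.frame(va, k, l)

# Restricted model: Cobb-Douglas in logs
cd <- lm(log(va) ~ log(k) + log(l), data = d)

# Unrestricted model: adds the squared logs and the interaction
tl <- lm(log(va) ~ log(k) + log(l) + I(log(k)^2) + I(log(l)^2) +
           I(log(k) * log(l)), data = d)

# Joint F-test of H0: alpha2 = beta2 = gamma = 0 (three restrictions)
ft <- anova(cd, tl)
print(ft)
```

A small p-value in the second row would reject Cobb-Douglas in favour of the flexible form. Applied to the assignment, the same two `lm()` calls would be fit to the real data with Y defined as value-added.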
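Q1(c)
# Note: k_mod was fit on the precomputed columns log_y, log_l and log_k, so emmeans()
# has no transformation to undo: type = "response" has no visible effect and the
# prediction below is on the log scale.
# at = is meant to be a named list of predictor values, e.g. list(log_k = ..., log_l = ...);
# the output below shows a single row at the sample means of log_l and log_k.
emmeans::emmeans(k_mod,specs = c("log_l","log_k"),at=c("l","k"),type="response")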
## log_l log_k emmean SE df lower.CL upper.CL
## 9.324 8.869 10.55 0.007373 4414 10.53 10.56
##
## Confidence level used: 0.95
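The output above gives one prediction in logs, so it does not yet answer the returns-to-scale question in part (c). A minimal sketch of that comparison follows, assuming the model is written with log() inside the formula so that levels of K and L can be supplied directly. It uses base `predict()` plus `exp()` as a stand-in for `emmeans(..., type = "response")`, and the simulated data frame `d` and column `va` are hypothetical.

```r
# Simulated stand-in data (hypothetical); not the klemA3 data.
set.seed(2)
n  <- 500
k  <- exp(rnorm(n, mean = 8, sd = 1.5))
l  <- exp(rnorm(n, mean = 9, sd = 1.5))
va <- exp(1 + 0.4 * log(k) + 0.6 * log(l) + rnorm(n, sd = 0.3))
d  <- data.frame(va, k, l)

fit <- lm(log(va) ~ log(k) + log(l) + I(log(k)^2) + I(log(l)^2) +
            I(log(k) * log(l)), data = d)

# Three scenarios: half the mean inputs, the mean inputs, twice the mean inputs
scale <- c(half = 0.5, mean = 1, double = 2)
newd  <- data.frame(k = mean(d$k) * scale, l = mean(d$l) * scale)

# Predict log(va), then back-transform to levels with exp()
pred <- exp(predict(fit, newdata = newd))
names(pred) <- names(scale)
print(pred)

# Output ratio when both inputs double, starting from the means
ratio <- unname(pred["double"] / pred["mean"])
print(ratio)
```

A ratio near 2 points to constant returns to scale at the mean, below 2 to decreasing returns, and above 2 to increasing returns. Because the flexible form is not homogeneous, the answer can differ between the half-to-mean and mean-to-double comparisons.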
plot(k_mod)
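Q1(d)
# Note: ggpredict() returns predicted values of log_y over a grid of log_l and log_k.
# These are fitted levels in logs, not the marginal products dY/dK and dY/dL.
a2_1<-ggpredict(k_mod,c("log_l","log_k"))
a2_1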
## # Predicted values of log_y
##
## # log_k = 7.07
##
## log_l | Predicted | 95% CI
## ----------------------------------
## 2.68 | 7.41 | [ 7.23, 7.59]
## 7.47 | 8.79 | [ 8.78, 8.81]
## 8.60 | 9.51 | [ 9.49, 9.53]
## 9.48 | 10.18 | [10.15, 10.21]
## 10.23 | 10.82 | [10.77, 10.87]
## 13.58 | 14.48 | [14.24, 14.72]
##
## # log_k = 8.87
##
## log_l | Predicted | 95% CI
## ----------------------------------
## 2.68 | 9.26 | [ 8.97, 9.55]
## 7.47 | 9.67 | [ 9.64, 9.69]
## 8.60 | 10.16 | [10.14, 10.17]
## 9.48 | 10.64 | [10.63, 10.66]
## 10.23 | 11.13 | [11.11, 11.15]
## 13.58 | 14.11 | [13.98, 14.24]
##
## # log_k = 10.67
##
## log_l | Predicted | 95% CI
## ----------------------------------
## 2.68 | 11.49 | [11.06, 11.93]
## 7.47 | 10.92 | [10.84, 10.99]
## 8.60 | 11.18 | [11.14, 11.21]
## 9.48 | 11.48 | [11.46, 11.51]
## 10.23 | 11.82 | [11.80, 11.84]
## 13.58 | 14.11 | [14.05, 14.18]
plot(a2_1)
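The predictions above are fitted values of log output, whereas part (d) asks for marginal products. For the flexible form, the output elasticity of capital is $\alpha_1 + 2\alpha_2 \log(K) + \gamma \log(L)$, and the marginal product is that elasticity times $Y/K$. The sketch below computes this analytically over a geometric grid of K, as an alternative to `margins()`; the simulated data, the grid `k_grid`, and holding L at its mean are all assumptions made for illustration.

```r
# Simulated stand-in data (hypothetical); not the klemA3 data.
set.seed(3)
n  <- 500
k  <- exp(rnorm(n, mean = 8, sd = 1.5))
l  <- exp(rnorm(n, mean = 9, sd = 1.5))
va <- exp(1 + 0.4 * log(k) + 0.6 * log(l) + rnorm(n, sd = 0.3))
d  <- data.frame(va, k, l)

fit <- lm(log(va) ~ log(k) + log(l) + I(log(k)^2) + I(log(l)^2) +
            I(log(k) * log(l)), data = d)

# Coefficients by position: intercept, alpha1, beta1, alpha2, beta2, gamma
b <- unname(coef(fit))

k_grid <- 10^(2:5)      # geometric series of K values, as the assignment suggests
l_bar  <- mean(d$l)     # hold labour at its sample mean

# Predicted output in levels along the grid
y_hat <- exp(predict(fit, newdata = data.frame(k = k_grid, l = l_bar)))

# Elasticity of output with respect to K, then marginal product dY/dK
elas_k <- b[2] + 2 * b[4] * log(k_grid) + b[6] * log(l_bar)
mp_k   <- unname(elas_k * y_hat / k_grid)
print(data.frame(k = k_grid, mp_k = mp_k))

# Plot on a log x-axis when run interactively
if (interactive()) {
  plot(k_grid, mp_k, log = "x", type = "b",
       xlab = "K (log scale)", ylab = "Marginal product of K")
}
```

The labour version swaps the roles of the coefficients: elasticity $\beta_1 + 2\beta_2 \log(L) + \gamma \log(K)$ times $Y/L$. Standard production theory would lead one to expect positive marginal products that decline as the input grows.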
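# Q2 data preparation
lfs21 <- readRDS("~/data/lfs21.rds")
# Drop the top age category
lfs <- lfs21 %>%
  filter(age_12 != "70 and over")
# If age_12 is stored as a factor, as.numeric() returns the category codes (1, 2, ...),
# not ages in years
lfs$age <- as.numeric(lfs$age_12)
lfs$wage <- as.numeric(lfs$hrlyearn)
# Collapse immig and union into the binary variables im2 and ca2
lfs <- lfs %>%
  mutate(im2 = ifelse(immig == "Non-immigrant", "Non-immigrant", "Immigrant")) %>%
  mutate(ca2 = ifelse(union == "Non-unionized", "Non-unionized", "Covered-Collective-agreement"))
# %>% mutate_if(is.character, factor)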
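The recoding steps above can be checked on a tiny example before running them on the full survey file. The sketch below uses a hypothetical four-row data frame, `lfs_toy`; the category labels in it are assumptions and may not match the actual labels in lfs21.rds. It also contrasts the factor level code with a band midpoint parsed from the label, since the two give different numeric scales for age.

```r
# Toy data; every label here is a hypothetical stand-in for the real LFS labels.
lfs_toy <- data.frame(
  age_12 = factor(c("15 to 19 years", "25 to 29 years",
                    "70 and over", "40 to 44 years")),
  immig  = c("Immigrant, landed 10 or less years earlier", "Non-immigrant",
             "Immigrant, landed more than 10 years earlier", "Non-immigrant"),
  union  = c("Union member",
             "Not a member but covered by a collective agreement",
             "Non-unionized", "Non-unionized"),
  stringsAsFactors = FALSE
)

# Drop the top age category
lfs_toy <- subset(lfs_toy, age_12 != "70 and over")

# Option 1: category codes of the remaining levels (1, 2, 3, ...)
lfs_toy$age_code <- as.numeric(droplevels(lfs_toy$age_12))

# Option 2: midpoint of each five-year band, parsed from the leading number
lower <- as.numeric(sub("^([0-9]+).*$", "\\1", as.character(lfs_toy$age_12)))
lfs_toy$age_mid <- lower + 2

# Binary immigrant status and collective-agreement coverage
lfs_toy$im2 <- ifelse(lfs_toy$immig == "Non-immigrant",
                      "non-immigrant", "immigrant")
lfs_toy$ca2 <- ifelse(lfs_toy$union == "Non-unionized",
                      "non-unionized", "collective agreement")

print(lfs_toy[, c("age_code", "age_mid", "im2", "ca2")])
```

Either age scale can enter the regression in linear and quadratic form, but the coefficients are then interpreted per category step or per year respectively.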
Q2(a)
l_mod<-lm(wage~age+I(age^2)+educ+sex+ca2+im2+cowmain+firmsize+prov,data=lfs)
summary(l_mod)
##
## Call:
## lm(formula = wage ~ age + I(age^2) + educ + sex + ca2 + im2 +
## cowmain + firmsize + prov, data = lfs)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -43.215  -7.428  -1.473   5.525  77.620
##
## Coefficients:
##                                           Estimate Std. Error t value Pr(>|t|)
## (Intercept)                               3.982917   0.510466   7.803 6.14e-15 ***
## age                                       4.774620   0.072091  66.230  < 2e-16 ***
## I(age^2)                                 -0.317012   0.005913 -53.609  < 2e-16 ***
## educSome high school                      1.261637   0.406858   3.101  0.00193 **
## educHigh school graduate                  2.129278   0.387420   5.496 3.90e-08 ***
## educSome postsecondary                    3.197833   0.411303   7.775 7.64e-15 ***
## educPostsecondary certificate or diploma  5.723664   0.382207  14.975  < 2e-16 ***
## educBachelor's degree                    11.948389   0.389321  30.690  < 2e-16 ***
## educAbove bachelor's degree              16.855738   0.403659  41.757  < 2e-16 ***
## sexFemale                                -5.201711   0.083482 -62.309  < 2e-16 ***
## ca2Non-unionized                          0.336650   0.110664   3.042  0.00235 **
## im2Non-immigrant                          4.331353   0.110756  39.107  < 2e-16 ***
## cowmainPrivate sector employees          -4.155718   0.119660 -34.729  < 2e-16 ***
## firmsize20 to 99 employees                1.956708   0.140281  13.949  < 2e-16 ***
## firmsize100 to 500 employees              3.141274   0.143731  21.855  < 2e-16 ***
## firmsizeMore than 500 employees           4.932237   0.121355  40.643  < 2e-16 ***
## provPrince Edward Island                 -2.032285   0.335886  -6.051 1.45e-09 ***
## provNova Scotia                          -1.760881   0.295396  -5.961 2.52e-09 ***
## provNew Brunswick                        -2.217299   0.292218  -7.588 3.29e-14 ***
## provQuebec                                1.337800   0.252766   5.293 1.21e-07 ***
## provOntario                               3.547768   0.248251  14.291  < 2e-16 ***
## provManitoba                              0.660738   0.267316   2.472  0.01345 *
## provSaskatchewan                          2.651351   0.280268   9.460  < 2e-16 ***
## provAlberta                               6.344972   0.267541  23.716  < 2e-16 ***
## provBritish Columbia                      3.918443   0.266103  14.725  < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard...
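The positive coefficient on age and the negative coefficient on its square in the Q2(a) output imply a wage profile that rises and then falls. As a worked step, the turning point follows from setting the derivative with respect to age to zero, writing $\hat b_1$ and $\hat b_2$ for the two reported age coefficients. The result is in the units of the numeric age variable as constructed above; if that variable holds category codes rather than years, the value refers to a position among the age bands.

```latex
\[
\frac{\partial\,\widehat{\text{wage}}}{\partial\,\text{age}}
  = \hat b_1 + 2\,\hat b_2\,\text{age} = 0
\quad\Longrightarrow\quad
\text{age}^{*} = -\frac{\hat b_1}{2\,\hat b_2}
  = \frac{4.774620}{2 \times 0.317012} \approx 7.53
\]
```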