DCT 90102: Econometric Methods for Business Research I, Assignment 1: CEO compensation

Can you assist with this assignment?


DCT 90102: Econometric Methods for Business Research I
Sebastiano Manzan, Fall 2021
Assignment 1: CEO compensation

• The weight of this problem set is 10% of the overall grade
• The problem set is done individually, but you can discuss it with other students
• The assignment should be completed in R
• Create a Word document that contains a section for each question and copy the tables and graphs produced in R. Provide also a discussion of the results
• Submit the Word file in Blackboard under Assignment 1 by 8.30am on Friday October 22, 2021

The goal of this assignment is to practice with the probability and statistics concepts that were discussed in the first residency. These concepts will be applied to conduct a preliminary analysis of a CEO compensation dataset for 2016. The variables in the dataset include:

• EXEC_FULLNAME: executive full name
• CONAME: company name
• TICKER: current ticker symbol of the company
• GENDER: gender of the executive officer
• TDC1: total compensation (salary + bonus + other annual + restricted stock grants + LTIP payouts + all other + value of option grants; unit: thousand)
• SALES: sales (unit: millions)
• SALECHG: sales 1 yr. percent change (unit: percentage)
• NI: net income (after extraordinary items and discontinued operations; unit: millions)
• ROEPER: return on equity (unit: percentage; net income before extraordinary items and discontinued operations divided by total common equity)
• ASSETS: assets (unit: millions)
• ROA: return on assets (unit: percentage; net income before extraordinary items and discontinued operations divided by total assets)
• MKTVAL: market value (fiscal-year end; unit: millions)
• AGE: executive's age
• EMPL: employees (unit: thousands)

The file is in csv format and you can use read.csv() to import it. Familiarize with the dataset by printing the first 10 rows and then answer the following questions:

Q1. (20%) Let's get an idea of how the distribution of the random variable CEO compensation looks like. For the variable TDC1, calculate the mean, median, standard deviation, skewness, kurtosis, minimum, and maximum. Create a table with these statistics and discuss the results.

Q2. (20%) Plot a histogram of TDC1 together with a normal distribution with mean and standard deviation estimated from the sample (follow the R commands provided below to get started). Describe the shape of the histogram and discuss whether the normal distribution provides a good fit to the histogram. (Copy/Paste the plot to the Word document)

    # below I called the data object "ceo.data"
    library(ggplot2)
    ggplot(ceo.data) +
      geom_histogram(aes(x = TDC1, y = ..density..), bins = 50, fill = "tomato2") +
      stat_function(fun = dnorm,
                    args = list(mean = mean(ceo.data$TDC1), sd = sd(ceo.data$TDC1)),
                    color = "dodgerblue2") +
      xlim(-10000, 50000) +
      theme_bw()

Q3. (20%) Calculate the correlation between the numerical variables in the dataset (that is, the columns from TDC1 to EMPL). Create a new object that contains only the numerical columns by following the example below:

    # assume you imported the dataset and called the object ceo.data and then
    # create a new object with only the numerical variables using subset() (read help file)
    ceo.data.subset <- subset(ceo.data, select = TDC1:EMPL)
    # so that the head(ceo.data.subset, 2) looks like this
    head(ceo.data.subset, 2)
       TDC1 SALES SALECHG   NI ROEPER ASSETS  ROA MKTVAL AGE  EMPL
    1 11141 40180   -1.98 2676  70.70  51274 5.22  24191  54 122.3
    2  1019  1313    9.80  126   5.67   2477 5.08   2748  67  10.8

Calculate the correlation of ceo.data.subset using the function cor(ceo.data.subset, use="complete.obs"). Discuss the results, in particular having in mind the goal of finding factors to explain TDC1. (Copy the correlation table to your Word document)

Q4. (20%) Do a scatter plot between TDC1 in the y-axis and another variable of your choice in the x-axis. Customize the graph by 1) labelling the x and y axis with a short description of the variable (rather than the default ceo.data$TDC1), 2) change the color of the dots to red, and 3) the size of the dots with argument cex = 0.5. Discuss the scatter plot in terms of the dependence between the x and y variables and the existence and effect of outliers¹ in the variables considered. (Copy/Paste the plot to the Word document)

Q5. (20%) Test the null hypothesis that the population growth rate of sales (SALECHG) is equal to zero against the alternative that is different from zero at 5% significance level. Provide the calculation and discuss what you learn from testing this hypothesis.

¹ An outlier is an observation that is many (at least 4) standard deviations away from the mean.
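For the calculation Q5 asks for, a minimal sketch of a two-sided one-sample t-test done by hand may help. It uses the standard statistic t = (x̄ − 0) / (s / √n) and checks it against R's built-in t.test(). The vector g below is simulated and only stands in for the SALECHG column (an assumption, since the dataset is not reproduced here), so the numbers it prints say nothing about the real data.

```r
# One-sample, two-sided t-test of H0: mu = 0, computed by hand.
# `g` is simulated (an assumption); it stands in for ceo.data$SALECHG.
set.seed(42)
g <- rnorm(200, mean = 3, sd = 20)

g <- g[!is.na(g)]                            # drop missing values, if any
n <- length(g)
t_stat <- (mean(g) - 0) / (sd(g) / sqrt(n))  # sample mean over its standard error
crit   <- qt(0.975, df = n - 1)              # two-sided critical value at the 5% level
p_val  <- 2 * pt(-abs(t_stat), df = n - 1)   # two-sided p-value
reject <- abs(t_stat) > crit                 # TRUE means H0 is rejected at 5%

# Cross-check against the built-in test
builtin <- t.test(g, mu = 0)

print(c(t = t_stat, critical = crit, p = p_val))
print(reject)
```

With the real data, g would be replaced by the SALECHG column; rejecting the null would suggest that average sales growth differs from zero in the population.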
Answered 9 days after Oct 02, 2021


Atreye answered on Oct 12 2021
114 Votes
Solution 1:
Code:
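> library(psych)  # provides skew() and kurtosi()
> # `data` is assumed to hold the imported CEO compensation CSV
> tdc1_max <- max(data$TDC1)
> tdc1_min <- min(data$TDC1)
> tdc1_mean <- mean(data$TDC1)
> tdc1_median <- median(data$TDC1)
> tdc1_skew <- skew(data$TDC1)
> tdc1_kurt <- kurtosi(data$TDC1)  # excess kurtosis (normal = 0)
> tdc1_sd <- sd(data$TDC1)
> statistic <- c("max", "min", "mean", "median", "skew", "kurt", "stdev")
> value <- c(tdc1_max, tdc1_min, tdc1_mean, tdc1_median, tdc1_skew, tdc1_kurt, tdc1_sd)
> df <- data.frame(statistic, value)
> df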
Output:
  statistic       value
1       max 98012.34400
2       min     0.00000
3      mean  6845.41715
4    median  5026.59500
5      skew     4.51064
6      kurt    43.69817
7     stdev  6685.18051

Interpretation:
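TDC1 is measured in thousands. Total compensation ranges from a minimum of 0 to a maximum of 98,012.34. The mean is 6,845.42, while the median is lower at 5,026.60. The standard deviation, as the measure of dispersion, is 6,685.18, which is almost as large as the mean. The skewness is 4.51, well above +1, so the distribution is strongly right-skewed; this agrees with the mean exceeding the median, since a few very large pay packages pull the mean upward. The kurtosis is 43.70, which implies that the data are leptokurtic, with much heavier tails than a normal distribution. kurtosi() from the psych package reports excess kurtosis, for which the normal benchmark is 0.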
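To make the skewness and kurtosis figures less of a black box, here is a rough sketch of the moment-based formulas in base R. The helper moment_stats() is a hypothetical name, not part of the answer above, and x is simulated right-skewed data standing in for data$TDC1 (an assumption). Packages such as psych apply small-sample adjustments, so their values can differ slightly from these plain moment ratios.

```r
# Moment-based skewness and excess kurtosis in base R.
# `x` is simulated (an assumption); it stands in for data$TDC1.
set.seed(1)
x <- rlnorm(1000, meanlog = 8.5, sdlog = 0.8)  # right-skewed, like pay data

# Hypothetical helper: plain moment ratios, no small-sample correction
moment_stats <- function(x) {
  x  <- x[!is.na(x)]
  n  <- length(x)
  d  <- x - mean(x)        # deviations from the mean
  m2 <- sum(d^2) / n       # second central moment
  m3 <- sum(d^3) / n       # third central moment
  m4 <- sum(d^4) / n       # fourth central moment
  c(skew = m3 / m2^1.5,            # > 0 means a longer right tail
    excess_kurt = m4 / m2^2 - 3)   # > 0 means heavier tails than the normal
}

stats <- moment_stats(x)
print(round(stats, 3))
print(mean(x) > median(x))  # right skew typically puts the mean above the median
```

Positive skewness together with a mean above the median is the same pattern the table above shows for TDC1.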
Solution 2:
Code:
> library(ggplot2)
> ggplot(data) +
+   geom_histogram(aes(x = TDC1, y = ..density..),
+                  bins = 50,
+                  fill = "tomato2") +
+   stat_function(fun = dnorm,
+                 args = list(mean = mean(data$TDC1),
+                             sd = sd(data$TDC1)),
+ ...