Instructions
1. The submission must be typed in a Microsoft WORD document and NOT handwritten.
2. Upload BOTH the Excel (working) file as well as the final Word/PDF file
4. DO NOT VIOLATE the honor code. Your submission MUST BE your OWN work.
5. Early submission is welcome and strongly encouraged.
Part I
Imagine that you are the CEO of a company that sells construction equipment and building materials
all over the country. The operations of the company are divided in 4 regions. Each region is headed
y a regional VP. Each region is divided in multiple te
itories. Each te
itory has its own sales team.
Each region has 30 or more sales teams. Each te
itorial sales team is headed by a sales manager. All
sales managers report to the co
esponding regional VP. All regional VPs report directly to you.
The numbers in the table below represent a sample of net incomes (in millions of US Dollars)
generated by the individual te
itorial sales teams during the last quarter.
Region A XXXXXXXXXXRegion B XXXXXXXXXXRegion C Region D
13.96 XXXXXXXXXX1 XXXXXXXXXX XXXXXXXXXX.54
16.07 XXXXXXXXXX3 XXXXXXXXXX XXXXXXXXXX.86
7.82 XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX39.13
16.58 XXXXXXXXXX0. XXXXXXXXXX0 XXXXXXXXXX2.17
7.93 XXXXXXXXXX16 XXXXXXXXXX XXXXXXXXXX10.37
4.41 XXXXXXXXXX15. XXXXXXXXXX7 XXXXXXXXXX2.17
XXXXXXXXXX22.47 XXXXXXXXXX15.94
XXXXXXXXXX13.04
Regional VPs enjoy significant autonomy. They can choose their own sales managers and sales
managers can choose their own team members. As a CEO you want to compare between the regions
and figure out as to which region is doing better than the others.
With this background, answer the following questions:
1. Find the 95% confidence interval of team sales in Region A.
2. Find the 95% confidence interval of team sales in Region B.
3. Find the 95% confidence interval of team sales in Region C.
4. Find the 95% confidence interval of team sales in Region D.
5. Find the 95% confidence interval of the difference of team sales between Regions A and B.
6. Find the 95% confidence interval of the difference of team sales between Regions A and C.
7. Find the 95% confidence interval of the difference of team sales between Regions A and D.
8. Find the 95% confidence interval of the difference of team sales between Regions B and C.
9. Find the 95% confidence interval of the difference of team sales between Regions B and D.
10. Find the 95% confidence interval of the difference of team sales between Regions C and D.
11. Test the hypothesis that the average team sales are equal between Regions A and B.
12. Test the hypothesis that the average team sales are equal between Regions A and C.
13. Test the hypothesis that the average team sales are equal between Regions A and D.
14. Test the hypothesis that the average team sales are equal between Regions B and C.
15. Test the hypothesis that the average team sales are equal between Regions B and D.
16. Test the hypothesis that the average team sales are equal between Regions C and D
17. Test the hypothesis that the variances of team sales are equal between Regions A and B.
18. Test the hypothesis that the variances of team sales are equal between Regions A and C.
19. Test the hypothesis that the variances of team sales are equal between Regions A and D.
20. Test the hypothesis that the variances of team sales are equal between Regions B and C.
21. Test the hypothesis that the variances of team sales are equal between Regions B and D.
22. Test the hypothesis that the variances of team sales are equal between Regions C and D.
23. Regional VPs always claim that there is no statistical difference between team sales between
egions. Perform an analysis to test their claim.
24. Given the answers to the questions above, what conclusion would you draw as a CEO regarding
the performance of various sales teams in different regions. What message would you have fo
the (a) regionals VPs, (b) the Board, (c) company shareholders? Carefully prepare an executive
eport of no more than 4-5 pages summarizing your observations backed by data and statistical
analysis. Please note that this report should be intelligible to a general audience member who
may not have a good background in statistics and quantitative methods.
For all questions above, assume that Probability of Type I E
or α 0.05
Part II
The attached file state-health.xlsx contains data on all US states for the following variables:
1. medinc: Median Annual Household Income
2. stax: State Collections of all taxes per Capita
3. leab: Life Expectancy at Birth (years)
4. phys: Number of physicians per 100,000 population
5. uins: Percentage of population who are uninsured
The data is collected from the https:
www.kff.org/statedata/ on November 1, 2021.
Using the data estimate the following regression models and carefully explain all the results.
a. leab 0 0 (medinc )
. ln(leab) 0 1 [ln(medinc)]
c. leab=0 1 (medinc ) 2 (medinc2 )
d. leab 0 1 (medinc ) 2(medinc XXXXXXXXXXphys)
e. leab 0 1 (medinc ) 2(medinc XXXXXXXXXXphys) +4(uins)
f. leab 0 1 (medinc ) 2(medinc XXXXXXXXXXphys) +4(uins) 5 (stax )
g. stax= 0 1 [ln(medinc)]
h. In(stax) 0 1 [ln(medinc)]
Augment the data by introducing a new column that contains the regional designation for a state.
The designations may be found athttps:
www2.census.gov/geo/pdfs/maps-data/maps
eference/us_regdiv.pdf
Perform appropriate ANOVA to test the following
a) Average per capita state tax collection is the same in all census regions.
b) Average life expectancy is the same in all census regions.
c) Average number of physicians per 100,000 population is the same in all census regions.
d) Average percentage of population who are uninsured is the same in all census regions.
Write an essay (no more than two pages) analyzing health disparities (as the differences in life
expectancies at birth) and their drivers across various states and regions in the USA.