# Integrative Learning Project – Part 2 and 3

Please read the following instructions carefully.

This part of the project is overall worth 108 points.
You will be graded for following directions as well.
If you do not format your work properly, it might not be graded until you format it correctly, and it will then be marked late.
· Create one file.
· Make sure that your name appears at the top of the file.
· Copy and paste, or rewrite, the questions below into your file.
· Bold the text of your questions.
· Answer each question in a separate paragraph right after the question.
· Make sure to include proper notation when providing statistics or parameters in your answers.
· To copy a graph from StatKey or Excel, use the Print Screen command (StatKey is the easiest tool to use for this assignment). If you do not know how to do this, search online or make an appointment with me.
DATASET
The Census Bureau groups data on households into census tracts. Census tracts should be divided so that the households in the census tracts share certain characteristics, such as economic status. Even though most tracts are fairly uniform in size, there is some variability, which is the focus of this lab.
This data set refers to the Census data collected back in 2010.
A sample of size 55 tracts has been selected from all tracts across the USA. Please use this Original Sample to answer all of the questions below.
See the separate csv file shared with you.
Part 2 - Inferential Statistics – Building Confidence Intervals
Using the data set provided (or a data set of your choice with at least 50 cases, provided that you have obtained prior approval from your professor), follow the directions below and answer each question.
1. Define your population of interest:
(what does the data represent)
2. Why is it important for you to have a random sample from the population of interest?
3. Using StatKey (or Excel), find the mean of the original random sample. (Use proper notation.)
4. Using StatKey (or Excel), find the standard deviation of the original, representative sample. (Use proper notation.)
5. Using this representative sample, use StatKey to create three Bootstrap samples and find the mean of each one. Then read the standard error of the Bootstrap Sampling Distribution created by these three Bootstrap samples.
Record the data for your first three Bootstrap samples in the table below.

| Bootstrap sample | Bootstrap sample size | Bootstrap Sample statistic | Bootstrap Sample measure of variability | Bootstrap Sampling Distribution measure of variability |
| --- | --- | --- | --- | --- |
| Number 1 | 1000 | | | |
| Number 2 | 2000 | | | |
| Number 3 | 3000 | | | |
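
For readers who want to see what a single Bootstrap sample is outside StatKey, here is a minimal Python sketch. It assumes the usual definition (resampling the original sample with replacement, keeping the same size). The values in `original_sample` and the helper name `draw_bootstrap_sample` are hypothetical stand-ins, not the course data.

```python
import random
import statistics

# Hypothetical stand-in values; replace with the 55 values from the course CSV file.
original_sample = [3120, 4875, 2950, 6010, 4420, 3785, 5230, 4105, 3660, 4990]


def draw_bootstrap_sample(sample, rng):
    """Resample with replacement, keeping the original sample size."""
    return rng.choices(sample, k=len(sample))


rng = random.Random(2010)  # fixed seed so the run is repeatable
for number in (1, 2, 3):
    boot = draw_bootstrap_sample(original_sample, rng)
    # Report the sample number, its size, its mean, and its standard deviation.
    print(number, len(boot), round(statistics.mean(boot), 1), round(statistics.stdev(boot), 1))
```

Each printed row corresponds to one row of the table: the Bootstrap sample's size, its statistic (the mean), and its measure of variability (the standard deviation).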

6. Now, still using StatKey, create a Bootstrap Sampling Distribution of at least 3003 samples.
Take a picture of your distribution and paste it in this file (attach here). Label your picture clearly.
7. Describe this Bootstrap Sampling Distribution: the shape, center, and variability given to you by StatKey. Use proper notation.

| | | Symbol (if applicable) | Measure |
| --- | --- | --- | --- |
| Shape of the distribution: | Bell-shaped | n.a. | |
| Center of the Distribution | mean | XXXXXXXXXX | |
| Variability | | | |
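
The center and variability of a Bootstrap Sampling Distribution can also be approximated in a few lines of Python. This is only a sketch under the same assumptions as StatKey's bootstrap for a single mean. The data values and the helper name `bootstrap_means` are hypothetical.

```python
import random
import statistics

# Hypothetical stand-in values; replace with the 55 values from the course CSV file.
original_sample = [3120, 4875, 2950, 6010, 4420, 3785, 5230, 4105, 3660, 4990]


def bootstrap_means(sample, n_resamples, rng):
    """Return the mean of each of n_resamples Bootstrap samples."""
    return [
        statistics.mean(rng.choices(sample, k=len(sample)))
        for _ in range(n_resamples)
    ]


rng = random.Random(2010)  # fixed seed so the run is repeatable
boot_means = bootstrap_means(original_sample, 3003, rng)

center = statistics.mean(boot_means)           # center of the distribution
standard_error = statistics.stdev(boot_means)  # variability: the bootstrap standard error
print(round(center, 1), round(standard_error, 1))
```

The center should land close to the original sample mean, and the standard deviation of the Bootstrap means is the standard error that StatKey reports.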

8. What does each dot in the Bootstrap Sampling Distribution represent?
9. What is the 90% Confidence Interval given this Bootstrap Sampling Distribution?
(Please paste the image of the 90% Confidence Interval here)
10. What is the 95% Confidence Interval given this Bootstrap Sampling Distribution?
(Please paste the image of the 95% Confidence Interval here)
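
One common way to read a Confidence Interval off a Bootstrap Sampling Distribution is the percentile method: cut off equal tails and keep the middle 90% or 95% of the Bootstrap means. The sketch below assumes that method. The function name `percentile_interval` is hypothetical, and the input list is a simple stand-in rather than real Bootstrap means.

```python
def percentile_interval(boot_means, level):
    """Return the endpoints that enclose the middle `level` share of the values."""
    ordered = sorted(boot_means)
    tail = (1 - level) / 2                  # share of values cut from each tail
    low_index = round(tail * len(ordered))  # number of values dropped on the left
    high_index = round((1 - tail) * len(ordered)) - 1
    return ordered[low_index], ordered[high_index]


# Stand-in "bootstrap means": the integers 1 to 1000, used only to show the mechanics.
stand_in_means = list(range(1, 1001))

interval_90 = percentile_interval(stand_in_means, 0.90)  # drops 50 values per tail
interval_95 = percentile_interval(stand_in_means, 0.95)  # drops 25 values per tail
print(interval_90, interval_95)
```

The 95% interval is wider than the 90% interval because it keeps more of the distribution.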
11. Interpret the 95% Confidence Interval – what does it represent in context?
Please explain with a full sentence what the 95% C.I. represents concretely.
The interpretation of this confidence interval is that we are 95% confident that the true population mean is between XXXXXXXXXX and XXXXXXXXXX.
12. If you wanted to use a Theoretical Statistical Distribution (instead of a Bootstrap Sampling Distribution) to identify the 95% Confidence Interval, what assumptions would you need to check to be able to do so?
State these assumptions then show if they are met in your case.
The dependent variable is assessed using a scale measure.
The participants are randomly selected.
The distribution of the population of interest must be approximately normal, or the sample must be large enough (commonly n ≥ 30) for the Central Limit Theorem to apply.
13. Using the Theoretical Statistical Distribution, find the 95 % Confidence Interval following these steps:
a) Identify the Test Statistic you would use in this case and explain why:
b) Find the Standard Error: (show the formula you used and indicate your final answer using proper notation)
c) Find the Margin of Error for your 95% Confidence Interval (show the formula you used and indicate your final answer using proper notation- do not use the rule of thumb)
d) Find the 95% Confidence interval: (show the formula you used and indicate your final answer using proper notation)
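
Steps b) through d) can be sketched in Python as follows. The sketch assumes a t-based interval for a single mean with an unknown population standard deviation, so SE = s / √n, margin of error = t* × SE, and the interval is x̄ ± margin of error. The summary statistics below are hypothetical, and the critical value 2.005 (approximately t* for 54 degrees of freedom at 95% confidence) should be checked against a t-table or StatKey.

```python
import math


def t_confidence_interval(x_bar, s, n, t_star):
    """Return (standard error, margin of error, (lower, upper)) for a single mean."""
    standard_error = s / math.sqrt(n)
    margin_of_error = t_star * standard_error
    return standard_error, margin_of_error, (x_bar - margin_of_error, x_bar + margin_of_error)


# Hypothetical summary statistics, for illustration only (not the course data).
se, me, ci = t_confidence_interval(x_bar=5000.0, s=2200.0, n=55, t_star=2.005)
print(round(se, 2), round(me, 2), tuple(round(end, 2) for end in ci))
```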
14. Assuming that the true population mean is approximately 4200, with a standard deviation of about 1980, do your Confidence Intervals from the Bootstrapping and Theoretical approaches provide realistic estimates of the true population mean? Explain.
Extra credit: (up to 3 points)
Suppose you wanted the margin of error associated with your 95% Confidence Interval to be no more than 250. How large would your sample size need to be?
(show the formula you used and indicate your final answer using proper notation. Show all your work)
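
A minimal sketch of the sample-size calculation, assuming the standard formula n = (z* × σ / ME)², rounded up to the next whole number. Using σ = 1980 from question 14 is an assumption made here for illustration. Your own work may call for the sample standard deviation instead. The function name `required_sample_size` is hypothetical.

```python
import math
from statistics import NormalDist


def required_sample_size(sigma, max_margin, confidence=0.95):
    """Smallest n whose z-based margin of error is at most max_margin."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    return math.ceil((z_star * sigma / max_margin) ** 2)


# Assumption: sigma is taken to be 1980, the value quoted in question 14.
n_needed = required_sample_size(sigma=1980, max_margin=250)
print(n_needed)
```

Because n grows with the square of σ / ME, halving the target margin of error roughly quadruples the required sample size.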
Part 3 – Inference – Building a Hypothesis Test
If you have chosen to use your own data set, please state the question you would like to answer with your Hypothesis Test:
If you are using the data provided to you, then we are going to run a Hypothesis Test to answer the following question: does your random sample of cases provide evidence that the true average is greater than 6500?
Starting from the representative sample above, follow the steps to test a hypothesis and answer the questions below:
15. Write your Null and Alternative Hypothesis
16. What is your original sample statistic?
(Use proper notation and explain it in words)
17. Using the original sample you have selected, use StatKey to create three Randomization samples (one at a time).
Record the data for these first three Randomization samples in the table below.

| Randomization sample | Randomization sample size | Randomization Sample statistic | Randomization Sample measure of variability | Randomization Sampling Distribution measure of variability |
| --- | --- | --- | --- | --- |
| Number 1 | | | | |
| Number 2 | | | | |
| Number 3 | | | | |

18. Using StatKey, create a Randomization Sampling Distribution of at least 3003 samples.
Take a picture of your distribution and paste it in this file (attach here). Label your picture clearly.
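
A Randomization Distribution for a single mean can be sketched in Python under one common approach, which is an assumption here about how the tool works: shift every value so that the sample mean equals the null value (6500), then resample the shifted data with replacement. The data values and helper names are hypothetical.

```python
import random
import statistics

NULL_MEAN = 6500  # value stated in the hypothesis test question

# Hypothetical stand-in values; replace with the 55 values from the course CSV file.
original_sample = [3120, 4875, 2950, 6010, 4420, 3785, 5230, 4105, 3660, 4990]


def shift_to_null(sample, null_mean):
    """Shift every value so the sample mean equals the null-hypothesis mean."""
    shift = null_mean - statistics.mean(sample)
    return [value + shift for value in sample]


def randomization_means(sample, null_mean, n_resamples, rng):
    """Resample the shifted data with replacement and record each mean."""
    shifted = shift_to_null(sample, null_mean)
    return [
        statistics.mean(rng.choices(shifted, k=len(shifted)))
        for _ in range(n_resamples)
    ]


rand_means = randomization_means(original_sample, NULL_MEAN, 3003, random.Random(2010))
print(round(statistics.mean(rand_means), 1), round(statistics.stdev(rand_means), 1))
```

The resulting distribution is centered near the null value rather than near the original sample mean, which is the main contrast with a Bootstrap Sampling Distribution.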
19. Describe this Randomization Distribution of 3003 samples: the shape, center, and variability given to you by StatKey. Use proper notation. (Fill out the table below.)

| | | Symbol (if applicable) | Measure |
| --- | --- | --- | --- |
| Shape of the distribution: | | n.a. | |
| Center of the Distribution | | | |
| Variability | | | |

20. Compare and contrast your Randomization Sampling Distribution to your Bootstrap Sampling Distribution
21. Based on your original sample data and the randomization distribution that you have, how likely is your original sample result expected to be?
(Put an x along this continuum to indicate your assessment.)
Impossible     somewhat possible     very possible     certain
22. Paste another picture of your Randomization Distribution here and mark your original sample data on the distribution with a Big Star:
23. What kind of test are you performing? (Circle one)
Left Tail     Two Tails     Right Tail
24. Paste a picture of your Randomization Distribution which confirms your answer above and highlights the test you are running in red.
25. What is the p-value associated with this test?
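
For a right-tail test such as "greater than 6500", the p-value from a Randomization Distribution is the proportion of randomization means that are at least as large as the original sample mean. The sketch below assumes that definition. The function name and the ten stand-in values are hypothetical and only show the counting.

```python
def right_tail_p_value(randomization_means, observed_mean):
    """Proportion of randomization means at least as large as the observed mean."""
    at_least_as_extreme = sum(1 for mean in randomization_means if mean >= observed_mean)
    return at_least_as_extreme / len(randomization_means)


# Tiny stand-in distribution, used only to make the counting easy to follow.
stand_in_means = [6300, 6400, 6450, 6500, 6500, 6550, 6600, 6700, 6800, 6900]

# Three of the ten values (6700, 6800, 6900) are at least 6700.
p_value = right_tail_p_value(stand_in_means, observed_mean=6700)
print(p_value)
```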
26. At a significance level of 5%, what is your statistical conclusion in this case?
28. If you wanted to use a Theoretical Statistical distribution (instead of a Randomization Sampling Distribution) to test your hypothesis, what assumptions would you need to check to be able to do so?
State these assumptions then show if they are met in your case.
29. Are these assumptions any different than those you checked when building a Confidence Interval? (circle one)
Yes     No
Using this approach, test your hypothesis following these steps:
30. Find the Standard Error:
(show the formula you used and indicate your final answer using proper notation. Show all work)
31. Based on the context of the question we are investigating and the nature of our sample data, indicate what is the most appropriate test statistic to use in this case and explain why:
32. Calculate your test statistic in this case:
(show the formula you used and indicate your final answer using proper notation. Show all your work)
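
If a t-test for a single mean is the appropriate choice (an assumption here, based on the population standard deviation being unknown), the test statistic is t = (x̄ − μ₀) / (s / √n). A minimal Python sketch with hypothetical summary statistics:

```python
import math


def t_statistic(x_bar, s, n, null_mean):
    """t = (sample mean - null mean) / standard error, with n - 1 degrees of freedom."""
    standard_error = s / math.sqrt(n)
    return (x_bar - null_mean) / standard_error


# Hypothetical summary statistics, for illustration only (not the course data).
t_value = t_statistic(x_bar=6900.0, s=2200.0, n=55, null_mean=6500)
print(round(t_value, 3))
```

A sample mean equal to the null value gives t = 0, and larger positive values of t give stronger evidence for a "greater than" alternative.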
33. Using StatKey, paste an image of the Theoretical Distribution you are using here, and mark the location of your test statistic with a Big Red Star.
34. What is the p-value associated with your original sample data?
35. Using a significance level of 5%, state clearly if you are rejecting the null hypothesis or failing to reject the null hypothesis. (Circle one.)
Reject the null hypothesis     Fail to reject the null hypothesis
36. Explain your conclusion in context for a non-statistician
37. Is your conclusion using this second method different than the conclusion you found using the randomization distribution? Explain:
For up to 3 points of Extra Credit:
Is your result supported by what you found in Part 2 of this project? Why or why not?
This Project is due May 4th.
Please follow the formatting guidelines stated at the top of the assignment.