was asked to create a random sample from the original sample and I've done that (labeled random...

Question

I was asked to create a random sample from the original sample, and I've done that (labeled random sample).

Integrative Learning Project – Part 2 and 3
Please read the following instructions carefully.
This part of the project is overall worth 108 points.
You will be graded for following directions as well.
If you do not format your work properly, it might not be graded until you format it correctly. Your project will be marked late.
· Create one File
· At the top of the file make sure that your name appears.
· Copy and paste or re-write the questions below in your file.
· Bold the text of your questions.
· Answer each question in a separate paragraph right after the question.
· Do NOT bold your answers.
· Make sure your answers are written as complete sentences.
· Make sure to include proper notation when providing statistics/parameters in your answers.
· To copy a graph from StatKey or Excel use the Print Screen command (it is easiest to use StatKey for this assignment)
(google how to do this, if you do not know how or make an appointment with me)
DATASET
The Census Bureau groups data on households into census tracts. Census tracts should be divided so that the households in the census tracts share certain characteristics, such as economic status. Even though most tracts are fairly uniform in size, there is some variability, which is the focus of this lab.
This data set refers to the Census data collected back in 2010.
A sample of size 55 tracts has been selected from all tracts across the USA. Please use this Original Sample to answer all of the questions below.
See the separate csv file shared with you.
Part 2 - Inferential Statistics – Building Confidence Intervals
Using the data set provided (or a data set of your choice including at least 50 cases, provided that you have acquired prior approval from your professor to use this data set of your choice), follow the following directions and reply to each question below.
Please make sure to always use proper statistical notation in your answers.
1. Define your population of interest: (what does the data represent)
2. Why is it important for you to have a random sample from the population of interest?
3. Using StatKey (or Excel) find the original random sample’s mean (use proper notation)
Answer: 3052.364
4. Using StatKey (or Excel) find the original and representative sample’s standard deviation (use proper notation)
Answer: 578.416
5. Using this representative sample you have selected use StatKey to create three Bootstrap samples and find for each Bootstrap sample the mean. Then read the standard error associated with the corresponding Bootstrap Sampling Distribution created by three Bootstrap Samples. Record your data for the first three Bootstrap samples you select, in the table below
Bootstrap sample | Bootstrap sample size | Bootstrap Sample statistic | Bootstrap Sample measure of variability | Bootstrap Sampling Distribution measure of variability
Number 1 | 1000
Number 2 | 2000
Number 3 | 3000
6. Now, still using StatKey create a Bootstrap Sampling distribution of at least 3003 samples. Take a picture of your distribution and paste it in this file (attach here) – Label your picture clearly.
7. Describe this Bootstrap Sampling Distribution – Shape, center, and variability given to you by StatKey. Use proper notation.
(row) | Symbol (if applicable) | Measure
Shape of the distribution: Bell-shaped | n.a. |
Center of the Distribution | mean | 3055.351
Variability | |
8. What does each dot in the Bootstrap Sampling Distribution represent?
Answer: bootstrap sample statistic
9. What is the 90% Confidence Interval given this Bootstrap Sampling Distribution? (Please paste the image of the 90% Confidence Interval here)
10. What is the 95% Confidence Interval given this Bootstrap Sampling Distribution? (Please paste the image of the 95% Confidence Interval here)
11. Interpret the 95% Confidence Interval – what does it represent in context? Please explain with a full sentence what the 95% C.I. represents concretely.
Answer: The interpretation of this confidence interval is that we are 95% confident that the population mean is between 2900.152 and 3207.433.
12. If you wanted to use a Theoretical Statistical Distribution (instead of a Bootstrap Sampling Distribution) to identify the 95% Confidence Interval, what assumptions would you need to check to be able to do so? State these assumptions then show if they are met in your case.
Answer: the degree of freedom. The dependent variable is assessed using a scale measure. The participants are randomly selected. The distribution of the population of interest must be approximately normal.
13. Using the Theoretical Statistical Distribution, find the 95% Confidence Interval following these steps:
a) Identify the Test Statistic you would use in this case and explain why:
b) Find the Standard Error: (show the formula you used and indicate your final answer using proper notation)
c) Find the Margin of Error for your 95% Confidence Interval (show the formula you used and indicate your final answer using proper notation; do not use the rule of thumb)
d) Find the 95% Confidence interval: (show the formula you used and indicate your final answer using proper notation)
14. Assuming that the true Population mean is equal to approximately 4200, with a standard deviation of about 1980, do your Confidence Intervals provide a realistic estimate for the true population mean in both the Bootstrapping and Theoretical approaches or not? Explain
Extra credit: (up to 3 points) Let’s assume that you wanted the margin of error associated with your 95% Confidence Interval to be no more than 250, how large would your sample size need to be in this case? (show the formula you used and indicate your final answer using proper notation. Show all your work)
Part 3 – Inference – Building a Hypothesis test
If you have chosen to use your own data set, please state the question you would like to find an answer to, with your Hypothesis Test:
If you are using the data provided to you, then we are going to run a Hypothesis Test to try to answer the following question: Test to see if your random sample of cases provides evidence that the true average is greater than 6500.
Starting from our representative sample above follow the steps to test a hypothesis and answer the questions below:
15. Write your Null and Alternative Hypothesis
16. What is your original sample statistic? (Use proper notation and explain it in words)
17. Using the original sample you have selected use StatKey to create three Randomization samples (one at a time). Record your data for these first three Randomization samples you created, in the table below
Randomization sample | Randomization sample size | Randomization Sample statistic | Randomization Sample measure of variability | Randomization Sampling Distribution measure of variability
Number 1
Number 2
Number 3
18. Using StatKey create a Randomization sampling distribution of at least 3003 samples. Take a picture of your distribution and paste it in this file (attach here) – Label your picture clearly.
19. Describe this randomization distribution of 3003 samples – Shape, center, and variability given to you by StatKey. Use proper notation. (Fill out the table below)
(row) | Symbol (if applicable) | Measure
Shape of the distribution: | n.a. |
Center of the Distribution | |
Variability | |
20. Compare and contrast your Randomization Sampling Distribution to your Bootstrap Sampling Distribution
21. Based on your original sample data and the randomization distribution that you have, how likely is your original sample result expected to be? (put an x along this continuum to indicate your consideration)
Impossible | somewhat possible | very possible | certain
22. Paste another picture of your Randomization Distribution here and mark your original sample data on the distribution with a Big Star:
23. What kind of test are you performing? (Circle one) Left Tail / Two Tails / Right Tail
24. Paste a picture of your Randomization Distribution which confirms your answer above and highlights the test you are running in red.
25. What is the p-value associated with this test?
26. At a significance level of 5%, what is your statistical conclusion in this case?
27. Please re-write your conclusion in context for a non-statistician:
28. If you wanted to use a Theoretical Statistical distribution (instead of a Randomization Sampling Distribution) to test your hypothesis, what assumptions would you need to check to be able to do so? State these assumptions then show if they are met in your case.
29. Are these assumptions any different than those you checked when building a Confidence Interval? (circle one) Yes / No
Using this approach, test your hypothesis following these steps:
30. Find the Standard Error: (show the formula you used and indicate your final answer using proper notation. Show all work)
31. Based on the context of the question we are investigating and the nature of our sample data, indicate what is the most appropriate test statistic to use in this case and explain why:
32. Calculate your test statistic in this case: (show the formula you used and indicate your final answer using proper notation. Show all your work)
33. Using StatKey paste an image of the Theoretical Distribution you are using here and the associated location of your test statistic with a Big Red Star
34. What is the p-value associated with your original sample data?
35. Using a significance level of 5%, state clearly if you are rejecting the null hypothesis or failing to reject the null hypothesis. (Circle one.) Reject the null hypothesis / Fail to reject the null hypothesis
36. Explain your conclusion in context for a non-statistician
37. Is your conclusion using this second method different than the conclusion you found using the randomization distribution? Explain:
For up to 3 points of Extra Credit: Is your result supported by what you found in Part 2 of this project? Why or why not?
This Project is due May 4th. Please submit your work in Blackboard as a pdf file. Please follow the formatting guidelines stated at the top of the assignment.
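For item 13, the theoretical 95% interval could be worked out roughly as follows. This is a minimal sketch using the sample values reported in the question (mean 3052.364, standard deviation 578.416, n = 55). The critical value t* = 2.005 for 54 degrees of freedom is an assumption read from a t table rather than computed here, and the function name `t_interval` is illustrative only.

```python
import math

def t_interval(x_bar, s, n, t_star):
    """Return (standard error, margin of error, lower, upper) for a one-sample t interval."""
    se = s / math.sqrt(n)   # SE = s / sqrt(n)
    me = t_star * se        # ME = t* times SE
    return se, me, x_bar - me, x_bar + me

# Values reported in the question; t_star is an assumed table value for df = n - 1 = 54.
se, me, lower, upper = t_interval(x_bar=3052.364, s=578.416, n=55, t_star=2.005)
print(f"SE = {se:.3f}, ME = {me:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

Under these assumptions the interval comes out at roughly 2896 to 3209, which is close to the bootstrap interval quoted in item 11 (2900.152 to 3207.433).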

project-part-2-and-3-guidelines-1-yod2ooqz.docx sampleoftractsfrom2010uscensus-jwjh42er.xlsx

Anu · Accepted Answer

Integrative Learning Project – Part 2 and 3 
Please read the following instructions carefully.
This part of the project is overall worth 108 points. 
You will be graded for following directions as well. 
If you do not format your work properly, it might not be graded until you format it correctly. Your project will be marked late.  
· Create one File 
· At the top of the file make sure that your name appears. 
· Copy and paste/ or re-write the questions below in your file.
· Bold the text of your questions. 
· Answer each question in a separate paragraph right after the question. 
· Do NOT bold your answers.
· Make sure your answers are written as complete sentences.
· Make sure to include proper notation when providing statistics/parameters in your answers.
· To copy a graph from StatKey or Excel use the Print Screen command (it is easiest to use StatKey for this assignment)
(google how to do this, if you do not know how or make an appointment with me) 
DATASET 
The Census Bureau groups data on households into census tracts. Census tracts should be divided so that the households in the census tracts share certain characteristics, such as economic status. Even though most tracts are fairly uniform in size, there is some variability, which is the focus of this lab. 
This data set refers to the Census data collected back in 2010. 
A sample of size 55 tracts has been selected from all tracts across the USA. Please use this Original Sample to answer all of the questions below. 
See the separate csv file shared with you. 
Part 2 - Inferential Statistics – Building Confidence Intervals
Using the data set provided (or a data set of your choice including at least 50 cases, provided that you have acquired prior approval from your professor to use this data set of your choice), follow the following directions and reply to each question below. 
Please make sure to always use proper statistical notation in your answers. 
1. Define your population of interest:
Answer: The population of interest is all census tracts in the USA, as recorded in the 2010 Census.
2. Why is it important for you to have a random sample from the population of interest? 
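Answer: The population is very large, so it is impractical to study every tract directly. Instead, we draw inferences about the population from a sample, and a random sample gives every tract the same chance of being selected, so the sample is likely to be representative of the population.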
3. Using StatKey (or Excel) find the original random sample’s mean  (use proper notation)
Answer: x̄ = 2953.691
4. Using StatKey (or Excel) find the original and representative sample’s standard deviation (use proper notation)
Answer: s = 575.34
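For anyone working outside StatKey or Excel, the two statistics in items 3 and 4 could be computed along these lines. This is only a sketch: the list `tract_sizes` holds made-up values for illustration, not the assignment's data. In proper notation the sample mean is written x̄ and the sample standard deviation s.

```python
import statistics

# Hypothetical tract sizes, for illustration only; these are NOT the assignment's data.
tract_sizes = [2500, 3000, 3500]

x_bar = statistics.mean(tract_sizes)   # sample mean (x-bar)
s = statistics.stdev(tract_sizes)      # sample standard deviation (divides by n - 1)

print(f"x-bar = {x_bar}, s = {s}")
```

`statistics.stdev` uses the n - 1 divisor, which is the sample (not population) standard deviation that the notation s refers to.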
5. Using this representative sample you have selected use StatKey to create three Bootstrap samples and find for each Bootstrap sample the mean. Then read the standard error associated with the corresponding Bootstrap Sampling Distribution created by three Bootstrap Samples.
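What StatKey does in item 5 can be sketched in a few lines. Each bootstrap sample is drawn with replacement from the original sample and has the same size as the original (55 here); the standard error is the standard deviation of the bootstrap means. The data below are simulated stand-ins, not the assignment's values, and the helper name `bootstrap_means` is hypothetical.

```python
import random
import statistics

def bootstrap_means(sample, n_resamples, seed=0):
    """Means of n_resamples bootstrap samples, each the same size as `sample`."""
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    n = len(sample)
    return [
        statistics.mean(rng.choices(sample, k=n))  # draw n values WITH replacement
        for _ in range(n_resamples)
    ]

# Simulated data standing in for the 55 tracts; NOT the assignment's values.
data_rng = random.Random(1)
sample = [data_rng.gauss(3000, 575) for _ in range(55)]

boot = bootstrap_means(sample, n_resamples=3000)
boot_se = statistics.stdev(boot)   # standard error = SD of the bootstrap means

# Percentile 95% interval: the middle 95% of the sorted bootstrap means.
boot_sorted = sorted(boot)
lower = boot_sorted[int(0.025 * len(boot_sorted))]
upper = boot_sorted[int(0.975 * len(boot_sorted)) - 1]
print(f"bootstrap SE = {boot_se:.1f}, 95% CI = ({lower:.1f}, {upper:.1f})")
```

The percentile interval at the end mirrors how a bootstrap confidence interval is read off the distribution: cut off 2.5% of the bootstrap means in each tail and report what remains.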
