1 REQUIRED FINAL DATA ANALYSIS PROJECT Assignment Spring 2021, Statistics 285: Section 06 GRAND TOTAL: XXXXXXXXXXpoints___ FINAL Data Analysis Project/Methods Paper XXXXXXXXXXDue Date: Friday, May 7...


I have attached all the files. Please look at the file called "final required"; it has all the requirements and explains the project.

REQUIRED FINAL DATA ANALYSIS PROJECT
Assignment Spring 2021, Statistics 285: Section 06
GRAND TOTAL: 150 points
FINAL Data Analysis Project/Methods Paper
Due Date: Friday, May 7, 2021

Requirements: Compile a data set of 20 points (observations, cases, study subjects, etc.) or more with two (2) quantitative variables (2 variables are needed for probability calculations and regression analysis).

Goal: Apply methods learned from Chapters 2 through 8 to a data set gathered or provided by a reliable source (Examples: https://www.kaggle.com/datasets; www.census.gov; www.cdc.gov; etc. Please, not Wikipedia! Thank you!) and write an explanation of the results. (Length: 7-10 pages narrative; Tables-Charts-Figures separate)

Final Paper Due: Friday, May 7, 2021
*****SUBMIT TO ASSIGNMENTS on CANVAS

INCLUDE SEPARATE COVER PAGE TO INCLUDE:
PROJECT TITLE
Submitted to: Lynn A. Agre, MPH, PhD
Statistics 285: Section 06
May 7, 2021
by: Your Name
Rutgers Student ID No.

***REMEMBER to INSERT PAGE NUMBERS IN FINAL PAPER***

I. Abstract: n = 150 words - brief description of proposed study - 5 sentences. (Total 15 points)
Example lead sentences for the Abstract or Summary:
This study will investigate the relationship between x and y variables.
This research will explore how x affects y.
This analysis will explore the association between x and y.

II. Introduction/Background: Brief (at least 2-3 pages) (25 points)
a. State the problem based on preponderance of evidence (at least 3 literature citations within the last 5 years). Literature Review - summarized in two or three paragraphs (reliable sources located on https://www.libraries.rutgers.edu/indexes/google_scholar. Use RU Net ID and log in to access articles for free, i.e. Get it @ R).
b. Who or what is the unit of analysis -- what is the topic being studied?
c. Where are the data collected from, including the source website?
d. When were the data collected -- what time frame, i.e. year, months?
e. Hypotheses: quantify by setting a threshold value, i.e. >, <, =. Note: formulate at least 3 hypotheses, with Ho (null) and Ha (alternative) stated.

III. Methods - i.e. Data Analysis Portion (100 points)
Describe the data -- i.e. who or what is the unit of analysis -- what is being compared (such as US states and then the two variables, such as percent poverty, percent unemployed by US state). Provide the data table. (4 points)
➢ Please note: need to include all formulas used. Formulas can be typed or cut and pasted from the course slides, or a web source, if correct. Step-by-step calculations not required, but helpful.
➢ Please do not insert pictures of hand-written formulas and calculations in the Word document.
➢ No credit will be given if not legible. Thank you.

From Chapter 2: (10 points) - Methods for Describing Data. Calculate all for both variables:
• Mean
• Standard deviation
• Range
• Inter-quartile range, lower and upper limit values
• Create box plot (can use online boxplot or draw by hand in Excel)
• Create histogram

From Chapter 3: (20 points) - Probability. Analyze the data set with at least 4 types of probability methods.
• Complementary rule
• Additive rule - for mutually exclusive events
• Conditional rule
• Multiplicative rule
• Bayes rule

From Chapter 4: (choose 4: 20 points) - Random Variables and Probability Distribution
• Discrete random variables, mean and standard deviation
• Binomial distribution
• Poisson distribution
• Hypergeometric
• Percentiles, z-scores

From Chapter 5: (10 points) - Sampling Distribution
• Calculate the sampling distribution based on the mean
• Compute the sampling distribution of the sample proportion

From Chapter 6: (10 points) - Confidence Intervals for Single Sample
• Calculate the population mean, the confidence interval either sigma (σ) known or unknown
• Apply either the normal z statistic or t-interval procedure (based on sample size and distribution of data)
• Calculate the necessary sample size

From Chapter 7: (10 points) - Hypothesis Testing for Single Sample
• Formulate hypothesis test for μ and illustrate rejection region
• Apply hypothesis test using either the normal z statistic or Student's t-statistic

From Chapter 8: (20 points) - Confidence Intervals for Two Samples
• Compute the standard deviation and confidence interval for two independent samples (will need a second data set of n = 20)
• Calculate either a one-tailed or two-tailed hypothesis test
• Calculate the paired difference confidence interval for μd = μ1 - μ2
• Use the large sample test of hypothesis about p1 - p2, normal z statistic

EXTRA CREDIT: From Chapter 11: Simple Linear Regression (choose 2 methods: 10 points)
• Fit the model: the least squares approach y = mx + b or y = b0 + b1x1
• Compute coefficient of determination or strength of linear relation with r²

IV. Results (2-3 pages) (25 points)
a. Describe findings from analyses.
b. Are the two variables related?
c. What are the significant results -- what are not significant results?

V. Discussion or Interpretation of the Results (2-3 pages) (25 points)
a. Why is the research question important?
b. How do these statistical methods answer your research question? How do these methods illustrate or explain the objective of your paper?
c. What are the limitations of the study? What other variables can be included?
d. What is the broader impact of the study? Who can benefit from this research?
e. What future studies can be generated from the results of your research?
Research investigators consult the existing body of knowledge or conduct a literature search first when developing and testing hypotheses, to extend, replicate and build on existing empirical research (i.e. based on hypothesis testing), either quantitative or qualitative.

VI. References - Bibliography (6-10 references, at least 50% from last five years) (20 points)
Citation format: Author last name, first initial (year). Article title. Journal or Newspaper, Vol No., page numbers.
Please do not copy and paste the URL (i.e. the web address for the citation). A citation in that format will not receive credit or any points. Example below:
Pickett, K.E., Wilkinson, R.G. (2015). Income inequality and health: A causal review. Social Science and Medicine, 128, 316-326.
Types of articles for possible citation are:
a. Citations on the topic being studied;
b. Scholarly articles discussing application of the statistical method chosen to test your hypothesis.

Make sure to address the following questions in your report, i.e. who, what, where, when, why, and how.
(1) Who or what is the unit of analysis?
(2) Where are the data collected from, including the source website?
(3) When were the data collected -- what time frame, i.e. year, months?
(4) Why is the research question important?
(5) How do these statistical methods answer your research questions? How do they illustrate or explain the objective of your paper?

Data Analysis Project - Common Questions

How to Create a Random Sample
Step No. 1 is to identify a data set -- raw data that have not been analyzed. This is the population.
• Column No. 1 contains the unit of analysis, such as US states, countries, teams or players, or perhaps individuals or persons.
• Column No. 2 contains quantitative variable No. 1; Column No. 3 contains quantitative variable No. 2.
The document entitled "Sample Structure of a Data Set" (the PDF file listed under the Required Final Data Analysis Project folder under the Canvas Resources folder) displays the format. Kaggle.com is one source for data -- this would be considered the population.
Step No. 2 is to create a random sample from the data set selected. From the population of N = ? (whatever the maximum number is in the Kaggle data set), you can build a data set from the full list of US states or countries using a random number generator available at https://www.random.org
➢ Insert the minimum number (i.e. 1) and then the maximum number of cases, or in this case countries, in the data set (i.e. 170, for example).
➢ The random number generator will provide a number between 1 and 170, perhaps 67.
➢ Thus, the first random selection from the countries on the list is No. 67.
➢ Then, enter country No. 67 in the spreadsheet with the two quantitative variables -- continuous variables which are suitable for the methods studied in the course.

Literature Review - Finding Articles
❖ The articles are published papers that are related to the variables you are analyzing in your data set.
❖ Simply summarize the scientific results in those three papers -- one paragraph for each paper.
❖ Preponderance of evidence in science refers to the published research on a similar or related topic to the data set that is being analyzed.
❖ You can access relevant literature, i.e. scholar.google.com, accessible via libraries.rutgers.edu, and search for articles (not personal web pages or blogs) that pertain to the data topic you are analyzing. See link: https://www.libraries.rutgers.edu/indexes/google_scholar

Creating Hypotheses
✓ The hypotheses then pertain to the data set that you are analyzing.
✓ Each method, each formula studied in the course uses the Ho (null) and the Ha (alternative) hypothesis, formulated to test the variable against a null value (such as a population mean).

Probability Calculations
• Regarding probability calculations, first create a relative frequency table.
• Probability is based on f over n or x over n, referring to total n = 20.
• To determine the numerator, set a threshold, i.e. count the number of states that have at least a certain threshold of depression rate out of 20 states total.
• Thus, if 5 ...
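The random-sample steps, the Chapter 2 summaries, and the threshold-based probability described in the brief can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the course's required method: the function names are hypothetical, the "lower and upper limit values" are assumed to mean the usual 1.5 × IQR box-plot fences, quartiles use the standard library's "inclusive" method (course slides may define quartiles differently), and a seeded pseudo-random generator stands in for random.org.

```python
import random
import statistics


def draw_sample_indices(population_size, sample_size, seed=None):
    """Pick distinct row numbers between 1 and population_size (inclusive).

    Stand-in for repeatedly asking random.org for a number between the
    minimum (1) and the maximum (e.g. 170) and discarding repeats.
    """
    rng = random.Random(seed)
    return sorted(rng.sample(range(1, population_size + 1), sample_size))


def describe(values):
    """Chapter 2 style summaries for one quantitative variable."""
    # Quartiles via the "inclusive" method; other textbooks use other rules.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return {
        "mean": statistics.mean(values),
        "std_dev": statistics.stdev(values),  # sample standard deviation
        "range": max(values) - min(values),
        "iqr": iqr,
        # Assumption: the limits are the 1.5 * IQR box-plot fences.
        "lower_limit": q1 - 1.5 * iqr,
        "upper_limit": q3 + 1.5 * iqr,
    }


def threshold_probability(values, threshold):
    """P(X >= threshold) as x over n: count of cases at or above the threshold."""
    count = sum(1 for v in values if v >= threshold)
    return count / len(values)


if __name__ == "__main__":
    # 20 row numbers drawn from a hypothetical population of 170 countries.
    print(draw_sample_indices(170, 20, seed=285))

    # Made-up placeholder numbers, not real data.
    placeholder = [2, 4, 4, 4, 5, 5, 7, 9]
    print(describe(placeholder))

    p = threshold_probability(placeholder, 5)
    print("P(X >= 5) =", p, "and by the complementary rule P(X < 5) =", 1 - p)
```

Fixing the seed makes the selection reproducible, which helps when the paper has to document how the 20 cases were chosen; with random.org the drawn numbers would simply be recorded by hand instead.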

final-required-data-analysi-project-stat-285spring2020ver103-31-2021-sv0cd3ek.pdf final-data-analysis-project-common-questions-vzs0n4tf.pdf data-sets-and-us-government-websites-offering-data-0dlxb0og.pdf data-set-sample-drawn-from-a-population-spring-2021-5i0nhicf.pdf data-set-how-to-create03-31-2021-rhb51eak.pdf

Answered 8 days after Apr 11, 2021

Answer To: 1 REQUIRED FINAL DATA ANALYSIS PROJECT Assignment Spring 2021, Statistics 285: Section 06 GRAND...

Naveen answered on Apr 20 2021


Covid Analysis
Abstract:
This study focuses on the coronavirus, which has changed human life everywhere. The analysis will help us understand the impact of the coronavirus on people and how it affects deaths. The study is important because even major countries with advanced medical equipment are finding it difficult to tackle the virus. The virus keeps changing its characteristics and spreads among people through physical gatherings and through the air.
    The study will help inform decisions about the precautions people follow and about government policies against the virus. It will also help identify trends in virus cases and in deaths caused by the virus. The study is therefore useful to governments as well as to the public. The pandemic is now in its second stage, which is more dangerous than the first, and the virus is spreading rapidly.
Introduction:
    The major problem facing the entire world is the medical emergency caused by the coronavirus. It has weakened every country's economy, left many people unemployed, and caused great hardship through the lockdowns that countries imposed to fight the virus. The data were collected from the Kaggle website [https://www.kaggle.com/imdevskp/corona-virus-report] and cover January 2020 through July 2020.
    Some people who contract the coronavirus cannot fight it off, and they die. Many of them already have medical problems, namely chronic diseases related to breathing. People with respiratory diseases are very likely to die if they are infected. Infected people also need proper medication. Even with medication, people older than 50 are more likely to die, but we cannot conclude that every infected person older than 50 will die, because some of them do recover from the virus.
     The aim of every government is to reduce the death rate among people infected with the coronavirus. To assess the relationship between the number of confirmed cases and the number of deaths, we use statistical analysis. Based on the analysis, we can recommend future actions to countries or states. The WHO also advises reducing the number of deaths by tracing patients as early as possible, because the virus spreads through coughing and through physical contact such as touching hands. It is crucial for all countries to act against the problem, but in some countries people do not support the government and hold rallies against lockdowns.
    The main reason is that people are not taking precautions: wearing masks, keeping social distance, and using sanitizer. When the wave of infections subsided, people stopped taking precautions, which is why cases are now increasing rapidly. No vaccine was available during the first year. Vaccines are now available, but myths about them circulate, because the vaccine works for some people while in rare cases deaths have been reported after vaccination. Economically weaker countries do not have the vaccine, and developed countries need to send it to the countries that lack it.
    Developed countries such as the United States of America, China, and Russia need to send vaccines to poor countries to reduce the number of deaths among patients. Front-line workers such as doctors also need to direct their efforts towards the virus. Many researchers are studying the virus to develop a better vaccine.
    Countries have been badly affected economically by the coronavirus pandemic. Millions of people have died because of the virus so far. Countries also need to take action such as stopping public gatherings, which are a major cause of the spread.
     I downloaded the data for 187 countries from Kaggle. The unit of analysis is the country, and the topic is coronavirus cases. The data were collected from [https://www.kaggle.com/imdevskp/corona-virus-report]. The data are daily records of cases, deaths, new cases, new deaths, and recoveries for each country.
Literature Review:
    The coronavirus did not first appear in 2019; according to a Canadian report, it was already identified in 1960. At that time, 17 out of 500 patients with the illness were found to have the coronavirus. It was considered a simple virus that needed no medical research until 2003. In the meantime, however, countries such as the United States, Singapore, and Thailand reported that the coronavirus was spreading rapidly. The virus went on to affect many people in later years.
    There are four types of coronavirus: alpha, beta, gamma, and delta. Some of the...
