1 Econ 378 Data Analysis Project Overview This project gives you hands-on experience summarizing and analyzing data of your own interest. You are welcome to use spreadsheet or statistical software...

It's just the Data Analysis Project at the beginning. Parts 1-3


Econ 378 Data Analysis Project

Overview

This project gives you hands-on experience summarizing and analyzing data of your own interest. You are welcome to use spreadsheet or statistical software such as Excel or Stata. Some major statistical databases are listed on Learning Suite, and numerous data sources are available freely on the internet. It will be easy to find something that fits the parameters of the project, but I encourage you to find something that is important to you personally, to make the project much more meaningful.[1] Feel free to consult with me or with other professors if you need help finding a specific type of data. Examples that you might find interesting include:

• Price data (e.g. wages, interest rates, stock returns, home values, insurance premiums)
• National statistics (e.g. GDP, employment, inflation, crime, tax rates) over time or across countries or states
• Sales data from a business (with permission/confidentiality, as appropriate)
• Health/sports/political statistics, opinion polls, or your own experimental research[2]
• Your own personal finances, time use, grades, etc.

This class prepares you to answer questions such as:
(1) On average, how big is variable X?
(2) How widely does X vary across observations?
(3) Is variable X positively or negatively correlated with variable Y, and how strong is this relationship?
(4) How can I use variable X to predict variable Y?

To answer these questions, you will need at least two variables, but this will not be difficult. Additional variables may make the analysis more interesting, but you will only analyze two at a time. You can also analyze multiple variables using Econometrics (Econ 388), so keep your data.

[1] If you lack research ideas, imagine that you have a magic crystal ball that can answer any one question of your choice. What do you wish to ask? That question is your research topic. Next, suppose that you have to answer that question on your own, but that you can ask the crystal ball for any secondary facts that will aid you in answering your big question for yourself. What more specific questions will lead you eventually to the answers you had wanted? Continue this procedure until you reach a question that is sufficiently specific (albeit several steps removed from your original interest) that it becomes feasible to collect the relevant data and get to work.

[2] If you collect data from human subjects, you must take care to preserve their safety and privacy, and ensure that participation is voluntary. If you wish to publish your data or results beyond this class, you will need advance approval from the BYU Internal Review Board, who monitor compliance with federal regulations (see http://orca.byu.edu/IRB/ for more details). Start early in that case, to leave time for the approval process, and request additional time if necessary.

Part 1 – Data Collection & Summary (+35)

1. (+15) Collect data of interest
You do not need to submit your data files; just describe the data: If it is not obvious already, what exactly do the variables measure (e.g., what units)?[3] How were they collected? Do you have data for the entire population of interest? Or just a sample? The first column of data should list the unit of observation (e.g. individual, firm, country, or time period).[4] For each observation, you need at least one quantitative variable (e.g. price, number of sales, age, GDP) and one binary variable (e.g. gender, race, industry, political party, sport position).[5] While not required, it is often interesting to pull data from multiple sources, or to construct new variables from existing data.[6] In the spreadsheet below, for example, government finance variables come from one source and a binary political variable comes from another.
Per capita variables are then computed simply as ratios; growth variables are computed simply as differences (as a ratio of the original level); and additional binary variables are constructed either by reducing a quantitative variable into “high” and “low” categories (e.g. GDP growth above or below 1.5%) or by comparing two existing variables (e.g. Gov. growth > GDP growth?).

Unit: Year. Original variables: GDP, Population, Gov. Spending, Republican House?. Constructed variables: all remaining columns.

| Year | GDP ($ bil.) | Population (mil.) | Gov. Spending ($ bil.) | Republican House? | Per capita GDP ($ thous.) | Per capita GDP growth (%) | GDP Growth > 1.5%? | Per capita Gov. spending ($ thous.) | Per capita Gov. growth (%) | Gov. growth > GDP growth? |
|------|--------|-----|-------|---|------|-------|---|------|-------|---|
| 2008 | 14,834 | 304 | 4,665 | 0 | 48.8 | -     | - | 15.3 | -     | - |
| 2009 | 14,418 | 307 | 5,179 | 0 | 47.0 | -3.7% | 0 | 16.9 | 10.1% | 1 |
| 2010 | 14,779 | 309 | 5,057 | 0 | 47.8 | 1.7%  | 1 | 16.3 | -3.2% | 0 |
| 2011 | 15,052 | 312 | 5,116 | 1 | 48.3 | 1.1%  | 0 | 16.4 | 0.4%  | 0 |
| 2012 | 15,471 | 314 | 5,042 | 1 | 49.3 | 2.0%  | 1 | 16.1 | -2.2% | 0 |
| 2013 | 15,759 | 316 | 4,955 | 1 | 49.8 | 1.1%  | 0 | 15.7 | -2.5% | 0 |
| 2014 | 16,077 | 319 | 4,957 | 1 | 50.4 | 1.3%  | 0 | 15.5 | -0.7% | 0 |

[3] For example, a humanitarian agency might rate sovereign governments as “corrupt” or not, and designate individuals as “in poverty” or not, but how are these categories assigned? What exactly do they mean?

[4] You need at least three observations; larger samples increase precision. If you have trouble identifying the unit of observation, it may be that your data are actually a summary of more primitive raw data. If so, this may be unusable, as the number of observations is effectively reduced to one.

[5] You can make a categorical variable binary simply by combining categories. For example, a “race” variable might have several codes for different races, but can be reduced simply to “white” and “minority”. You can also construct binary variables from quantitative variables (see below).

[6] When the unit of observation is a time period (e.g. year or week), it can also double as a quantitative variable.

2. (+3) Identify your audience
Identify some audience that might find this data interesting: a policy maker, a business leader, a consumer, etc. In Part 2, you will report your findings to this individual. List any questions (at least two) that this audience might have, that you believe your data can shed (at least partial) light on.

3. (+6) Summarize individual variables
a. Summarize at least one binary variable by reporting the total fraction in each category.
b. Summarize at least one quantitative variable by reporting the minimum, maximum, mean, and standard deviation.
c. Use one binary variable to divide your data into subgroups, and report the conditional minimum, conditional maximum, conditional mean, and conditional standard deviation for this subgroup (e.g. average wages among female workers). Note: for all subsequent analysis of this project, you may use the full sample or this restricted sample, as you wish.
d. Represent at least one quantitative variable graphically, using a histogram.[7]

4. (+6) Correlation and causation
Choose two variables, and do the following:
a. Identify reasons why the variables might be positively or negatively correlated. Might one cause the other to increase or decrease? Is reverse causation possible? Are there outside factors that might cause both variables to move? Predict the sign and magnitude of the correlation coefficient ρ between these variables.
b. For any outside factors that you identify in part a, tell what additional data could be collected and examined, to control for these outside factors.
c. Compute the actual correlation coefficient, and compare it with your prediction above.

5. (+5) Graphical Summary
Compare two variables graphically, using something like the following. Include labels (e.g. color-code, axis labels, legend, etc.) so that your graphic is clear.
• Scatter chart (two quantitative variables)
• Double pie chart (two categorical variables)
• Color-coded scatter chart (two quantitative and one categorical variable)
• Bar or column chart (one categorical and one or more quantitative or categorical variables)
• Line graph (one quantitative variable and time)
• Bubble chart (three quantitative variables)

Briefly describe some facet of the relationship between the two variables that is apparent in the type of graphic you chose.

[7] In MS Excel 2010, load the “Data Analysis” tool pack (File>Options>Add-ins for PC or Tools>Add-ins for Mac), and then select Data>Data Analysis>Histogram. Select the Input Range and Bin Range, and be sure to select the box for “Chart Output”. Note that a bar chart is not the same as a histogram.

Part 2 – Statistical Inference (+28)
Do the following, stating any important assumptions that your answers rely on.[8] You do not need to write out all of your computations, but should make clear how you arrived at your answers.

1. Mean
a. (+2) For at least one quantitative variable, find a point estimate of the underlying population mean μ.[9] Compute a confidence interval for μ, at a confidence level of your choice.[10]
b. (+2) Perform a one- or two-sided test, at the level of your choice, of the hypothesis that μ is equal to a specific value of your choice. State the associated p-value.

2. Standard Deviation (OPTIONAL; must do 2 or 5 or 6)
a. (+2) For at least one quantitative variable, find a point estimate of the underlying population standard deviation σ. Compute a confidence interval for σ, at a
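As a rough illustration of the Part 2 mean computations in the assignment text above, the following Python sketch finds a point estimate, a 95% confidence interval, and a t statistic for the per capita GDP growth column of the example table. It is only a sketch under stated assumptions: the six yearly values are treated as a random sample from a roughly normal population, the critical value 2.571 is hardcoded from a standard t-table for 5 degrees of freedom, and the null value `MU_0 = 0` is a hypothetical choice. The p-value itself needs the t distribution's CDF, which is left to Excel, Stata, or a similar tool.

```python
import math
import statistics

# Per capita GDP growth (%) for 2009-2014, copied from the example table in Part 1.
growth = [-3.7, 1.7, 1.1, 2.0, 1.1, 1.3]

n = len(growth)
mean = statistics.mean(growth)   # point estimate of the population mean
sd = statistics.stdev(growth)    # sample standard deviation (n - 1 denominator)
se = sd / math.sqrt(n)           # standard error of the mean

# Two-sided 95% critical value of Student's t with n - 1 = 5 degrees of freedom.
# Assumption: hardcoded from a t-table; it must change if n or the level changes.
T_CRIT_95_DF5 = 2.571

ci_low = mean - T_CRIT_95_DF5 * se
ci_high = mean + T_CRIT_95_DF5 * se

# Two-sided test of H0: mu = MU_0 at the 5% level.
MU_0 = 0.0  # hypothetical null value, chosen only for illustration
t_stat = (mean - MU_0) / se
reject_h0 = abs(t_stat) > T_CRIT_95_DF5

print(f"n = {n}, mean = {mean:.3f}, sd = {sd:.3f}, se = {se:.3f}")
print(f"95% CI for mu: ({ci_low:.3f}, {ci_high:.3f})")
print(f"t statistic = {t_stat:.3f}, reject H0 at 5%: {reject_h0}")
```

With only six observations the interval is wide, which matches the assignment's remark that larger samples increase precision.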
Answered 4 days after Apr 07, 2021


Swapnil answered on Apr 10 2021
World Happiness Report
Introduction:
The World Happiness Report is an indicator of the state of global happiness. Since 2012, the reports have evaluated happiness scores in countries around the world, along with factors that could influence those scores, drawing on economics, psychology, survey analysis, national statistics, health, public policy, and more. For example, the economy could affect people’s quality of life and thus influence happiness. In this report, we explore the relationship between these factors and happiness, and use data visualization to show the strength of each relationship.
Description:
The World Happiness Report is a landmark survey of the state of global happiness. The first report was published in 2012, the second in 2013, the third in 2015, and the fourth as the 2016 update. The World Happiness Report 2017, which ranks 155 countries by their happiness levels, was released at a United Nations event celebrating the International Day of Happiness on March 20th. The report is gaining recognition as governments, organizations, and civil society use happiness indicators to inform their policy-making decisions. Leading experts across fields such as economics, psychology, national statistics, health, and public policy describe how such measurements can be used to assess the progress of nations. The report reviews the state of happiness in the world today and shows how the new science of happiness explains personal and national variations in happiness.
The happiness scores and rankings use data from the Gallup World Poll. The scores are based on answers to the life evaluation question asked in the poll, in which respondents rate their own current lives on a scale. The scores come from nationally representative samples for the years 2013 to 2016 and use the Gallup weights to make the estimates representative. The columns following the happiness score estimate the extent to which each of six factors (economic production, social support, life expectancy, freedom, absence of corruption, and generosity) contributes to making life evaluations higher in each country than in a hypothetical country whose values equal the world's lowest national averages. These factor columns have no impact on the total score reported for each country.
Dataset:
We extracted the World Happiness dataset from Kaggle, which provides 5 .csv files for the World Happiness Reports from 2015 to 2019. The dataset contains six factors as columns: economic production, social support, life expectancy, freedom, absence of corruption, and generosity. Here is an example of the data, showing countries’ happiness scores and other related factors (e.g., GDP per capita, social...