Assignment 3
PS 3780 Data Literacy & Visualization, Summer 2022
Due Date: Thursday, June 9, 2022 at 11:59 p.m.

Please save your answers to these questions as one .pdf file (use the "save as" function in most word processors). Be sure to include your name, your teammate's name if you have one, and the assignment number. Submit the file to Carmen by the due date.

Basics of R

CIA World Factbook

Use the CIA World Factbook country comparison guide to download a numeric .csv dataset: https://www.cia.gov/the-world-factbook/references/guide-to-country-comparisons/. Import the dataset into R. Please answer the following questions with R and copy the commands that you use for answering each question.

1. (.5 pt) Which dataset did you download and what is the stored name of the dataset in R?
2. (.5 pt) What is the average value of your chosen variable? What is the median value of your chosen variable?
3. (.5 pt) Does that average value happen to be the actual value of any country?
4. (.5 pt) Does that median value happen to be the actual value of any country?
5. (.5 pt) Which country has the lowest value?
6. (.5 pt) Which country is ranked 10th, 30th, and 50th respectively?
7. (.5 pt) Which country ranks higher in the variable that you choose, Namibia or Botswana (the data might be missing in your dataset, but at least you need to write down the R command that you use for inquiry)?

Presidential Approval

Visit 538 to find data on the popularity of Joe Biden through the first term of his presidency. At the bottom of their interactive, https://projects.fivethirtyeight.com/biden-approval-rating/, there is a link to download the associated polls. Import the dataset into R. Please answer the following questions with R and copy the commands that you use for answering each question.

1. (.5 pt) Is the dataset properly read in? How many observations and variables are in the dataset?
2. (.5 pt) List the different values of "population".
3. (1 pt) What is the average approval for polls of each "population"? Does there appear to be much of a difference? (Hint: Create and save a subset of the data for each methodology using indexing, subsetting, or filtering and find the mean of that new dataset.)

Presidential Approval Advanced

Use the same 538 dataset to address the following questions. Again, copy the commands that you use. When asked for the correlation between two variables, use the function cor( x , y ) for the specific x and y that you want to compare. Make sure to use the form 'dataset$variable' to indicate a variable that exists within a dataset.

1. (.5 pt) Using approve and disapprove, create two new variables in the dataset: a variable named net measuring the difference of approve and disapprove (subtract the variables) and a variable named ratio measuring the ratio of approve to disapprove (divide the variables). What is the average of net and ratio?
2. (1 pt) What is the value of net and ratio (the two variables you just created) for the polls that had the largest and smallest sample size?
3. (1 pt) What is the correlation between the pairs net and sample size and ratio and sample size? How do these correlations relate to the values found in the previous question?
Answered Same Day: Jun 08, 2022


Subhanbasha answered on Jun 09 2022
Answers
CIA World Factbook:
Question 1:
Ans: I downloaded the population dataset, which contains the population of each country in the world, and stored it in R under the name Pop_data. The read.csv function reads the file, and the data can then be referenced by that name.
Code:
# Question 1 - Reading the data
Pop_data <- read.csv("Population data.csv")
# Cleaning data - change data type as numeric
Pop_data$value <- as.numeric(gsub(",","",Pop_data$value))
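The cleaning step matters because a column whose numbers contain thousands separators is read in as text, and as.numeric() on a string such as "1,234" returns NA with a warning. A minimal sketch of the same gsub() approach, using a small made-up data frame (the names and values below are illustrative only, not taken from the Factbook file):

```r
# Toy data standing in for the downloaded CSV (values are made up)
toy <- data.frame(
  name  = c("A", "B", "C"),
  value = c("1,234,567", "89,000", "450"),
  stringsAsFactors = FALSE
)

# Strip the thousands separators, then convert the text to numbers
toy$value <- as.numeric(gsub(",", "", toy$value))

# Confirm that the column is now numeric
str(toy$value)
```

Checking the column with str() or is.numeric() right after the conversion catches any leftover non-numeric entries before they affect later summaries.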
Question 2:
Ans: I chose the variable value, which is the population of each country. The mean is 33355923 and the median is 5454533. The mean is roughly six times the median, which indicates a right-skewed distribution: a few very populous countries pull the mean upward.
Code:
# Mean and median of value(population)
Mean_value <- mean(Pop_data$value,na.rm=TRUE)
Median_value <- median(Pop_data$value,na.rm=TRUE)
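To see why the mean and median can sit so far apart, here is a small sketch on made-up numbers (not the Factbook data) in which one very large value drags the mean well above the median. The missing value is included only to show what na.rm = TRUE does.

```r
# Made-up, right-skewed values with one missing entry
pop <- c(50, 1200, NA, 5400, 98000, 1400000)

# na.rm = TRUE drops the NA before computing each summary
mean_pop   <- mean(pop, na.rm = TRUE)
median_pop <- median(pop, na.rm = TRUE)

# How many times larger the mean is than the median
mean_pop / median_pop
```

Without na.rm = TRUE, both mean() and median() would return NA for this vector.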
Question 3:
Ans: The average value does not match the population of any country.
Code:
# Question 3 - check whether the mean equals any country's value
# na.rm = TRUE keeps missing values from turning the result into NA
any(Pop_data$value == Mean_value, na.rm = TRUE)
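A mean is usually a non-integer, so an exact == test against integer populations will almost always be FALSE. More generally, exact comparison of floating-point numbers can miss values that are equal for practical purposes. The sketch below uses made-up numbers and a hypothetical tolerance, tol, to contrast the two kinds of check; it is an illustration, not part of the original answer.

```r
# Made-up values; 0.1 + 0.2 is not stored as exactly 0.3
x      <- c(0.1, 0.2, 0.3)
target <- 0.1 + 0.2

# Exact comparison: misses 0.3 because of rounding error
exact_hit <- any(x == target)

# Comparison within a small tolerance (tol is an illustrative choice)
tol      <- 1e-8
near_hit <- any(abs(x - target) < tol)

exact_hit  # FALSE
near_hit   # TRUE
```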
Question 4:
Ans: The median value matches the population of the Central African Republic.
Code:
# Question 4 - check whether the median equals any country's value
any(Pop_data$value == Median_value, na.rm = TRUE)
# Matched country - which() skips rows with missing values
Pop_data[which(Pop_data$value == Median_value), 'name']
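Whether the median can coincide with an observed value depends on how many non-missing values there are. With an odd count the median is the middle observation; with an even count R averages the two middle observations, so a match only happens if those two are equal. A short sketch with made-up vectors:

```r
# Made-up vectors with an even and an odd number of values
even_vals <- c(10, 20, 30, 40)
odd_vals  <- c(10, 20, 30)

# Even count: median is the average of 20 and 30
median(even_vals)  # 25

# Does the median appear among the observed values?
even_match <- median(even_vals) %in% even_vals  # FALSE
odd_match  <- median(odd_vals) %in% odd_vals    # TRUE
```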
Question 5:
Ans: The Pitcairn Islands have the lowest population of all countries in the dataset.
Code:
# Question 5 - country with the lowest value
# which.min() ignores missing values and returns the row index of the minimum
Pop_data[which.min(Pop_data$value), 'name']
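Looking up the minimum is sensitive to missing values: min() returns NA when any entry is NA (unless na.rm = TRUE is set), and comparing with NA produces NA for every row. The sketch below, on a made-up data frame, contrasts that with which.min(), which skips NA entries.

```r
# Made-up data with one missing value
toy <- data.frame(
  name  = c("A", "B", "C", "D"),
  value = c(300, NA, 50, 900),
  stringsAsFactors = FALSE
)

# min() without na.rm is NA here, so every comparison is NA
naive <- toy[toy$value == min(toy$value), "name"]

# which.min() ignores NA and returns the position of the smallest value
lowest <- toy$name[which.min(toy$value)]

naive   # all NA
lowest  # "C"
```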
Question 6:
Ans: The 10th-ranked country is Mexico, the 30th-ranked country is Sudan, and the 50th-ranked country is Venezuela.
Code:
# Question 6
# 10th ranked country
Pop_data[Pop_data$ranking ==10,'name']
# 30th ranked country
Pop_data[Pop_data$ranking ==30,'name']
# 50th ranked country
Pop_data[Pop_data$ranking ==50,'name']
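The lookups above rely on the file already having a ranking column. If a downloaded dataset lacked one, a ranking could be derived by sorting. The following is a sketch under that assumption, using made-up data and picking positions 1, 3, and 5 instead of 10, 30, and 50 so the toy example stays small.

```r
# Made-up data without a ranking column
toy <- data.frame(
  name  = c("A", "B", "C", "D", "E"),
  value = c(300, 1200, 50, 900, 700),
  stringsAsFactors = FALSE
)

# Sort from largest to smallest and number the rows
ranked <- toy[order(-toy$value), ]
ranked$ranking <- seq_len(nrow(ranked))

# Several ranks can be looked up in one command with %in%
picked <- ranked[ranked$ranking %in% c(1, 3, 5), "name"]
picked  # "B" "E" "C"
```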
Question 7:
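Ans: The question asks which of Namibia and Botswana ranks higher, so the first command below retrieves both rows; the country with the smaller ranking number (the larger population) ranks higher. Across the whole dataset, China has the largest population.
Code:
# Question 7
# Rows for Namibia and Botswana (a smaller ranking number means a higher rank)
Pop_data[Pop_data$name %in% c("Namibia","Botswana"),c('name','value','ranking')]
# Highest ranked country by population
Pop_data[which.max(Pop_data$value),'name']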
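Because the assignment notes that either country might be missing from the chosen dataset, a name-based comparison should handle that case. The helper below, higher_ranked(), is a hypothetical function written for illustration; it assumes columns called name and ranking, and it is demonstrated on placeholder country names rather than real data.

```r
# Hypothetical helper: returns the name with the smaller ranking number,
# or NA if either name is absent from the data
higher_ranked <- function(df, a, b) {
  ra <- df$ranking[match(a, df$name)]  # NA when a is not found
  rb <- df$ranking[match(b, df$name)]  # NA when b is not found
  if (is.na(ra) || is.na(rb)) return(NA_character_)
  if (ra < rb) a else b
}

# Placeholder data (names and rankings are made up)
toy <- data.frame(
  name    = c("Alpha", "Beta", "Gamma"),
  ranking = c(2, 1, 3),
  stringsAsFactors = FALSE
)

higher_ranked(toy, "Alpha", "Gamma")  # "Alpha"
higher_ranked(toy, "Alpha", "Delta")  # NA, because "Delta" is missing
```

With the real data, the call would take the form higher_ranked(Pop_data, "Namibia", "Botswana").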
Presidential Approval
Question 1:
Ans: The read.csv function loaded the data into R properly. The dataset has 3714 observations and 22 variables (columns).
Code:
# Question 1 -...
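The remaining approval questions all follow the same pattern of inspecting, subsetting, and deriving columns. The sketch below shows one possible shape for those commands on a small made-up data frame. The column names population, approve, and disapprove come from the assignment text; sample_size is an assumed name for the sample size column, and all values are invented, so none of the numbers here are results from the 538 file.

```r
# Made-up polls (values are illustrative only)
polls <- data.frame(
  population  = c("a", "rv", "lv", "a", "rv", "lv"),
  approve     = c(50, 48, 45, 52, 47, 44),
  disapprove  = c(40, 46, 50, 41, 48, 52),
  sample_size = c(1000, 1500, 800, 2000, 1200, 900),
  stringsAsFactors = FALSE
)

# Observations and variables, then the distinct population values
dim(polls)
unique(polls$population)

# Mean approval via a saved subset, and for every group at once
adults      <- polls[polls$population == "a", ]
mean_adults <- mean(adults$approve, na.rm = TRUE)
by_pop      <- tapply(polls$approve, polls$population, mean, na.rm = TRUE)

# New variables: difference and ratio of approve and disapprove
polls$net   <- polls$approve - polls$disapprove
polls$ratio <- polls$approve / polls$disapprove
mean(polls$net, na.rm = TRUE)
mean(polls$ratio, na.rm = TRUE)

# net and ratio for the polls with the largest and smallest samples
polls[which.max(polls$sample_size), c("net", "ratio")]
polls[which.min(polls$sample_size), c("net", "ratio")]

# Correlations with sample size, skipping rows with missing values
cor_net   <- cor(polls$net, polls$sample_size, use = "complete.obs")
cor_ratio <- cor(polls$ratio, polls$sample_size, use = "complete.obs")
```

The tapply() call gives the per-group means in one step, while the saved subset follows the approach suggested in the assignment's hint.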