
Question

Hello,

My deadline for this homework just got extended to Monday, 19th of April, 9pm EST. Could you kindly help me with this homework? I have attached instructions for an R homework I need help with again. It covers web scraping and SQL. The homework needs to be done in 1) an R Markdown file and 2) knitted as an HTML file too. My budget is $110 for this.

Kind regards,
RK

Assignment Instructions: Your final document should be an 1) RMarkdown file and 2) HTML file knitted from R Markdown. In answering each of the following questions please include a) the question as a header in your Rmarkdown report, b) then include the raw code that you used to generate your results, and c) the top ten rows/values/or elements of the resulting dataframe, vector, or list created in your results (unless a lesser amount is requested). Feel free to refer to any R scripts provided throughout the course to answer the following questions:

1. Web-scrape all table data from the following web-page and build a data frame in R. Limit your final table to include columns for "Package", "Item", "Title", "Rows", and "Cols". Print the first five rows of the table. http://vincentarelbundock.github.io/Rdatasets/datasets.html
2. Web-scrape the full links to every CSV file listed in the CSV column of the web-page. Add a new column to your data frame that includes these links. Name the column "CSV Links". Print the first five rows of the table. Note: You may need to use string operations to recreate the full link to the csv files after they are scraped.
3. Use R code/functions to search the "Title" column to return the row of data with the title, "Violent Crime Rates by US State"
4. Import the csv file into R using the full link listed in the "CSV Links" column for this dataset. Create a new variable called "Violent_crime" that adds together data for all columns in the dataset that contain violent crime data (i.e.-add data from assault, murder, and rape columns together in new column called "Violent_crime").
5. Using one or more of the following example datasets that come preloaded in R (see ?data(state) for more information), add state region codes, state divisions, and all of the variables from the "state.x77" dataset to your Violent Crime Rates data frame. Print the first five lines of your new dataset.
   • state.abb, state.area, state.center, state.division, state.name, state.region
   Note 1: state names can be extracted to new columns for joins from these datasets by using row.names() if needed.
   Note 2: Some of these state datasets may be matrix objects or lists. Be sure to convert them to data frames before joining them if needed.
6. Calculate the average for each numeric column in the dataset.
7. Group the data by region and then calculate the average for each numeric column in the dataset per region. Which region had the highest population (data is from the late 1970s)? Which region had the most violent crime?
8. Group the data by division and then calculate the average for each numeric column in the dataset per division. Which division had the highest population (data is from the late 1970s)? Which division had the most violent crime?
9. What SQL statement would you write to return two columns denoting income and Illiteracy in your state data?
10. What SQL statement would you write to return two columns denoting income and Illiteracy in your state data and sort the data from the highest to lowest income values?
11. What SQL statement would you write to return two columns denoting income and Illiteracy in your state data and sort the data from the highest to lowest income values and limit the data to incomes at or higher than 5000?
12. What SQL statement would you write to return two columns denoting income and Illiteracy in your state data and sort the data from the highest to lowest income values and limit the data to incomes at or higher than 5000 and return the top 10 rows only?
13. Create a new data frame that includes two columns from your state data denoting state names and violent crimes. Spread the state names to 50 unique columns with a single row that includes the violent crime data per state. Print the first five columns of the new dataset.
14. Take the dataset from question 13 and use a function from the apply family of functions to return a named list with each name denoting a state and each value per state indicating the square root of the value for violent crimes.
15. Subset the list you created in question 14 to extract values for Texas and New York.
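Questions 9 to 12 build one SQL query step by step (select, sort, filter, limit). A minimal sketch of that progression, assuming the DBI and RSQLite packages are installed and using only the built-in state.x77 data, could look like the following. The table name `states`, the `State` column, and the in-memory SQLite database are illustrative choices, not part of the assignment.

```r
library(DBI)

# Build a data frame from the built-in state.x77 matrix.
# "State" and the table name "states" are assumed names for illustration.
states <- data.frame(State = rownames(state.x77), state.x77, row.names = NULL)

# Load the data into a throwaway in-memory SQLite database.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "states", states)

# Question 9: return the two columns only.
q9 <- dbGetQuery(con, "SELECT Income, Illiteracy FROM states")

# Question 10: sort from highest to lowest income.
q10 <- dbGetQuery(con, "
  SELECT Income, Illiteracy
  FROM states
  ORDER BY Income DESC")

# Question 11: keep incomes at or above 5000.
q11 <- dbGetQuery(con, "
  SELECT Income, Illiteracy
  FROM states
  WHERE Income >= 5000
  ORDER BY Income DESC")

# Question 12: same as above, but return at most 10 rows.
q12 <- dbGetQuery(con, "
  SELECT Income, Illiteracy
  FROM states
  WHERE Income >= 5000
  ORDER BY Income DESC
  LIMIT 10")

dbDisconnect(con)
```

Note that in SQL the `WHERE` clause comes before `ORDER BY`, and `LIMIT` comes last, regardless of the order in which the question lists the requirements.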

assignment-vc0wi2fu.pdf

Abr Writing · Accepted Answer

assignment.html
Assignment
18/04/2021
Question 1
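Getting the HTML response from the given web page using the rvest package.
library(rvest)
library(dplyr)
# URL taken from Question 1; the assignment and read_html() call are restored from context
web.link <- "http://vincentarelbundock.github.io/Rdatasets/datasets.html"
data <- read_html(web.link) %>%
  html_nodes("table") %>%
  .[[2]] %>%
  html_table(fill = TRUE) %>%
  as.data.frame %>%
  select(Package, Item, Title, Rows, Cols)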
Printing the first five rows of the table
head(data, n = 5)
  Package          Item
1     AER       Affairs
2     AER  ArgentinaCPI
3     AER     BankWages
4     AER BenderlyZwick
5     AER     BondYield
                                                         Title Rows Cols
1                             Fair's Extramarital Affairs Data  601    9
2                            Consumer Price Index in Argentina   80    2
3                                                   Bank Wages  474    4
4 Benderly and Zwick Data: Inflation, Growth and Stock Returns   31    5
5                                              Bond Yield Data   60    2
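To see the table-scraping step in isolation, without depending on the live page, here is a small sketch that parses an inline HTML string with rvest. The HTML snippet, its two-table layout, and the `href` value are made up for illustration; only the row values are copied from the output above.

```r
library(rvest)

# A made-up page with two tables; the second one mimics the datasets table.
html <- '<html><body>
<table><tr><th>Note</th></tr><tr><td>layout table</td></tr></table>
<table>
  <tr><th>Package</th><th>Item</th><th>Title</th><th>Rows</th><th>Cols</th><th>CSV</th></tr>
  <tr><td>AER</td><td>BankWages</td><td>Bank Wages</td><td>474</td><td>4</td>
      <td><a href="csv/AER/BankWages.csv">CSV</a></td></tr>
  <tr><td>AER</td><td>BondYield</td><td>Bond Yield Data</td><td>60</td><td>2</td>
      <td><a href="csv/AER/BondYield.csv">CSV</a></td></tr>
</table>
</body></html>'

page <- read_html(html)

# html_nodes() returns every <table>; pick the second one by position.
tables <- html_nodes(page, "table")
datasets <- html_table(tables[[2]])

# Keep only the columns the question asks for.
datasets <- datasets[, c("Package", "Item", "Title", "Rows", "Cols")]
head(datasets, 5)
```

Selecting the table by position is fragile if the page layout changes, so it is worth checking `length(tables)` and the column names before relying on the index.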
Question 2
Extracting all the links for all the CSV files
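# The start of this pipeline is restored from context (same page as Question 1)
links <- read_html(web.link) %>%
  html_nodes("a") %>%
  html_attr('href') %>%
  as.data.frame %>%
  add_rownames %>%
  filter(rowname %in% seq(1,2*nrow(data),2)) %>%
  select(-one_of("rowname"))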
Warning: `add_rownames()` is deprecated as of dplyr 1.0.0.
Please use `tibble::rownames_to_column()` instead.
This warning is displayed once every 8 hours.
Call `lifecycle::last_warnings()` to see where this warning was generated.
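The filter above keeps every other link by position, which assumes the anchors on the page strictly alternate between CSV and documentation links. A pattern-based alternative, which also covers the string operations mentioned in Question 2, is sketched below in base R. The `hrefs` vector is hypothetical sample input standing in for the result of `html_attr("href")`, and the base URL is assumed from the address in Question 1.

```r
# Assumed site root, derived from the URL given in Question 1.
base_url <- "http://vincentarelbundock.github.io/Rdatasets/"

# Hypothetical hrefs, standing in for what html_attr("href") might return.
hrefs <- c("csv/AER/Affairs.csv", "doc/AER/Affairs.html",
           "csv/AER/ArgentinaCPI.csv", "doc/AER/ArgentinaCPI.html")

# Keep only links that end in ".csv" instead of relying on their position.
csv_hrefs <- hrefs[grepl("\\.csv$", hrefs)]

# Prepend the site root only when a link is relative.
is_absolute <- grepl("^https?://", csv_hrefs)
csv_links <- ifelse(is_absolute, csv_hrefs, paste0(base_url, csv_hrefs))
csv_links
```

If the deprecated `add_rownames()` is still needed, the warning itself names the replacement, `tibble::rownames_to_column()`.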
Adding a new column CSV Links to the data.
data$`CSV Links` <- [...]
[...]
  as.data.frame
state.x77$X <- [...]
[...]
Question 6
violent_data %>%
  select(where(is.numeric)) %>%
  summarise_all(mean) %>%
  t %>%
  as.data.frame %>%
  rename(Mean = V1)
                      Mean
Murder.x           7.78800
Assault          170.76000
UrbanPop          65.54000
Rape              21.23200
Violent_crime    199.78000
Population      4246.42000
Income          4435.80000
Illiteracy         1.17000
Life Exp          70.87860
Murder.y           7.37800
HS Grad           53.10800
Frost            104.46000
Area           70735.88000
state.area     72367.98000
state.center.x   -92.46414
state.center.y    39.41074
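The crime-related means in this table can be cross-checked with base R alone, using the built-in USArrests dataset (assumed here to hold the same values as the "Violent Crime Rates by US State" CSV). This is only a sketch of the Question 4 and Question 6 steps, not the code used to produce the output above.

```r
# USArrests ships with R: Murder, Assault, UrbanPop, Rape per state.
crime <- USArrests

# Question 4: add the three violent crime columns together.
crime$Violent_crime <- crime$Murder + crime$Assault + crime$Rape

# Question 6: average of every numeric column, one row per column.
numeric_cols <- sapply(crime, is.numeric)
means <- data.frame(Mean = colMeans(crime[, numeric_cols]))
means
```

`colMeans()` gives the same result as the `summarise_all(mean)` followed by `t` approach, without the transpose step.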
Question 7
violent_data %>%
  group_by(state.region) %>%
  select(where(is.numeric), state.region) %>%
  summarise_all(mean) %>%
  arrange(desc(Population))
# A tibble: 4 x 17
  state.region Murder.x Assault UrbanPop  Rape Violent_crime Population Income
  <fct>           <dbl>   <dbl>    <dbl> <dbl>         <dbl>      <dbl>  <dbl>
1 Northeast        4.7     127.     70.6  13.8          145.      5495.  4570.
2 North Centr~     5.7     120.     64.4  18.4          144.      4803   4611.
3 South           11.7     220      59.4  21.2          253.      4208.  4012.
4 West             7.03    187.     70.6  29.1          223.      2915.  4703.
# ... with 9 more variables: Illiteracy <dbl>, `Life Exp` <dbl>,
#   Murder.y <dbl>, `HS Grad` <dbl>, Frost <dbl>, Area <dbl>, state.area <dbl>,
#   state.center.x <dbl>, state.center.y <dbl>
The Northeast region had the highest average population per state (about 5,495 thousand; state.x77 reports population in thousands).
violent_data %>%
  group_by(state.region) %>%
  select(where(is.numeric), state.region) %>%
  summarise_all(mean) %>%
  arrange(desc(Violent_crime))
# A tibble: 4 x 17
  state.region Murder.x Assault UrbanPop  Rape Violent_crime Population Income
  <fct>           <dbl>   <dbl>    <dbl> <dbl>         <dbl>      <dbl>  <dbl>
1 South           11.7     220      59.4  21.2          253.      4208.  4012.
2 West             7.03    187.     70.6  29.1          223.      2915.  4703.
3 Northeast        4.7     127.     70.6  13.8          145.      5495.  4570.
4 North Centr~     5.7     120.     64.4  18.4          144.      4803   4611.
# ... with 9 more variables: Illiteracy <dbl>, `Life Exp` <dbl>,
#   Murder.y <dbl>, `HS Grad` <dbl>, Frost <dbl>, Area <dbl>, state.area <dbl>,
#   state.center.x <dbl>, state.center.y <dbl>
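The South region had the most violent crime: its average Violent_crime value per state (about 253, the sum of the murder, assault, and rape arrest rates per 100,000 residents) is the highest of the four regions.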
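Because the grouped summaries above are per-state averages, "highest population" can be read either as the highest mean per state or as the highest regional total, and the two need not agree. A short base R sketch, using only the built-in state.x77 and state.region objects (which are stored in the same state order), compares both readings.

```r
# Combine the state.x77 matrix with the region factor; both follow state.name order.
states <- data.frame(state.x77, region = state.region)

# Mean population per state within each region (what summarise_all(mean) reports).
mean_pop <- tapply(states$Population, states$region, mean)

# Total population of each region.
total_pop <- tapply(states$Population, states$region, sum)

# Region with the largest value under each reading.
names(which.max(mean_pop))
names(which.max(total_pop))

round(mean_pop)
```

Which reading the assignment intends is not stated, so it may be worth reporting both.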
Question 8
violent_data %>%
  group_by(state.division) %>%
  select(where(is.numeric), state.division) %>%
  summarise_all(mean) %>%
  arrange(desc(Population))
# A tibble: 9 x 17
  state.division Murder.
