ECON7930-2021Spring Final Project

Requires R code and a short report.


Due date: 14 May 2021, 11:59 pm (before midnight)

(I need to submit students' individual grades to the department on or before noon, 18 May. Please be merciful and give me three days to grade your work. Thanks very much in advance!)

Instruction: As explained in class, you have two tasks to finish for the final project. Each student needs to finish them and submit the related product (e.g., graph), code, and report to the instructor on their own. You can download the related example code and data either from Google Drive:
https://drive.google.com/drive/folders/1xM2-ntrBbbHaWJ2o5Zo4w6UQ8HGYDBFC?usp=sharing
or from Baidu Yunpan:
Link: https://pan.baidu.com/s/1sXEwL-nN7wQoIGptILUqXQ
Password: m8mr

Task 1. Visualize spatial data

In the "final project" folder that you can download from the above link, there is a folder titled "spatial_visualization." Inside, I prepared 18 different visualization examples for making different types of maps. Personally, I spent quite some time putting them together, so I hope that you appreciate and enjoy learning from these examples. In the future, if you are interested in exploring more, you can also use them as resources for further study.

For the final project, you only need to choose one example and replicate it using new data of your own choice that you find online. As you may recall, to make a map we need base map files (often shapefiles, GeoJSON, or other spatial data formats). Below are some great places to find these base map files for free.
DIVA-GIS (base maps for different levels of administrative units in different countries): https://www.diva-gis.org/gdata
ArcGIS Hub: https://hub.arcgis.com/
Harvard GIS Data Project: http://hgl.harvard.edu:8080/opengeoportal/
Harvard China GIS Data Project: http://worldmap.harvard.edu/chinamap
Hong Kong GEOINFO MAP: https://www.map.gov.hk/gm/

You can also use the API services from OpenStreetMap, Mapbox, Geocodio, etc., to download real-time street maps. Consult examples 1, 3, 5, and 13 for more detail. Also, see the following blog post, or search Google, for more great free GIS resources:
https://gisgeography.com/best-free-gis-data-sources-raster-vector/

If you have questions or problems regarding where to find the base maps or data, you can contact me for discussion. But do try your best before contacting me.

After you find the base map files and the data you plan to visualize, you can use them to replicate the example you choose. For the final product, you need to submit a graph (map) produced by your code, the related R script that produces it, and a short report (1-2 pages, including figures) explaining where you found the data and what basic procedures you used to clean and manipulate the data before drawing the map. Finally, include a few sentences telling me what we can learn from the map and what kind of spatial pattern you see in it.

Task 2. Topic Model

For the second task, please choose one corpus (textual data) among the seven datasets that I sent you (in the "topic model" subfolder), and use the topic model method we introduced in class to analyze the data. The aim is to find the meaningful topics in the corpus. You can choose either the simple LDA or one of the more advanced variants.
For the final product, you need to submit both the related R script for the topic model analysis and a short report (1-3 pages, including figures and tables). In the report, please explain the data you chose, what kind of pre-processing you did before the analysis, what topic model(s) you chose, how you chose the key parameter K (number of topics), and what the outcome is (some tables and graphs showing the top keywords for each topic, etc.). Based on the outcome, tell me whether you think the model(s) you chose produce meaningful, coherent topics from the documents and whether the topics make sense to you, and perhaps add some validation of the topic model based on the methods we discussed in class.

Final Submission

After you finish both tasks, please send me the related files via Moodle by the deadline. For the report, please put the reports for both tasks in one text file (it can be a Word file, a Markdown file, or any other text format that I can open), tables and graphs included. Please use Times New Roman font, 12pt font size, and reasonable margins throughout. As for the source code, I should be able to replicate all your work using it.

Finally, I downloaded some of the example code and data online (via application) and prepared some of it myself. Please honor the property rights and my work: only use them for the final project or your personal study, and do NOT share them online or with others. Thanks so much! Enjoy! As always, I hope you can learn something interesting and useful from the project!
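On the question of how to choose K in Task 2, one common approach is to fit LDA for several candidate values and compare perplexity on held-out documents. The following is only a minimal sketch under stated assumptions: the toy documents, the candidate values of K, and the train/held-out split are all made up for illustration, and the methods discussed in class may differ.

```r
library(tm)
library(topicmodels)

# Toy corpus (made up for illustration; not one of the course datasets)
docs <- c(
  "stocks rally as bank earnings beat forecasts",
  "bank earnings lift stocks",
  "oil prices fall as crude supply rises",
  "crude oil supply cut lifts prices",
  "stocks climb on strong bank earnings",
  "oil supply glut hits crude prices",
  "bank stocks rally on earnings",
  "crude oil prices fall on supply"
)
dtm <- DocumentTermMatrix(VCorpus(VectorSource(docs)))

# Hold out the last two documents (an arbitrary split for this sketch)
heldOut <- c(7, 8)
trainDTM <- dtm[-heldOut, ]
testDTM <- dtm[heldOut, ]

# Candidate numbers of topics (assumed values)
candidateK <- 2:4

# Fit one model per K (default VEM estimation) and score it on held-out data
heldOutPerplexity <- sapply(candidateK, function(k) {
  fit <- LDA(trainDTM, k = k, control = list(seed = 1234))
  perplexity(fit, newdata = testDTM)
})

results <- data.frame(K = candidateK, perplexity = heldOutPerplexity)
print(results)
```

Lower held-out perplexity suggests a better predictive fit, but it does not guarantee interpretable topics, so the top keywords for each candidate K are still worth reading by eye.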
Answered 3 days after May 11, 2021


Saravana answered on May 14 2021
# clean current workspace
rm(list = ls(all.names = TRUE))
# set options
options(stringsAsFactors = FALSE) # no automatic conversion of strings to factors
options("scipen" = 100, "digits" = 4) # suppress scientific notation
# load libraries
library(tm)
library(topicmodels)
library(reshape2)
library(ggplot2)
library(wordcloud)
library(pals)
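# load data
#textdata <- base::readRDS(url("https://slcladal.github.io/data/sotu_paragraphs.rda", "rb"))
# read.csv treats only double quotes as quoting characters and does not treat "#" as a comment,
# which suits free-text headlines better than read.table's defaults
dt <- read.csv("/media/priyan/Files/GreyNodes/Assignment 6/Ques2/raw_partner_headlines.csv", header = TRUE, fill = TRUE)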
# load stopwords
english_stopwords <- readLines("https://slcladal.github.io/resources/stopwords_en.txt", encoding = "UTF-8")
# create corpus object
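# tm documents DataframeSource as expecting "doc_id" as the first column and "text" as the second
colnames(dt)[2] <- "text"
dt$doc_id <- seq_len(nrow(dt))
dt <- dt[, c("doc_id", "text", setdiff(colnames(dt), c("doc_id", "text")))]
corpus <- tm::Corpus(DataframeSource(dt))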
### Preprocessing
library(SnowballC)
# Preprocessing chain
processedCorpus <- tm_map(corpus, content_transformer(tolower))
processedCorpus <- tm_map(processedCorpus, removeWords, english_stopwords)
processedCorpus <- tm_map(processedCorpus, removePunctuation, preserve_intra_word_dashes = TRUE)
processedCorpus <- tm_map(processedCorpus, removeNumbers)
processedCorpus <- tm_map(processedCorpus, tm::stemDocument, language = "en")
processedCorpus <- tm_map(processedCorpus, stripWhitespace)
### preparing for LDA model
# compute document term matrix with terms >=...
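The posted listing is cut off at this point. As a rough illustration of how the remaining steps might look (this is not the answerer's code), the sketch below builds a document-term matrix with a minimum document frequency, fits an LDA model by Gibbs sampling, and prints the top terms per topic. The toy documents, the threshold `minimumFrequency`, and the values of `K`, `seed`, and `iter` are all assumptions chosen for the example.

```r
library(tm)
library(topicmodels)

# Toy headlines (made up for illustration; the real script uses processedCorpus)
docs <- c(
  "stocks rally as earnings beat forecasts",
  "earnings season lifts bank stocks",
  "oil prices fall as supply rises",
  "crude oil supply glut hits prices",
  "bank stocks climb on strong earnings",
  "oil producers cut supply to lift prices"
)
toyCorpus <- VCorpus(VectorSource(docs))

# Keep only terms that occur in at least this many documents (hypothetical value)
minimumFrequency <- 2
DTM <- DocumentTermMatrix(
  toyCorpus,
  control = list(bounds = list(global = c(minimumFrequency, Inf)))
)

# LDA cannot handle documents with no remaining terms, so drop empty rows
nonEmpty <- rowSums(as.matrix(DTM)) > 0
DTM <- DTM[nonEmpty, ]

# Number of topics (assumed value for this toy example)
K <- 2
topicModel <- LDA(DTM, k = K, method = "Gibbs",
                  control = list(seed = 1234, iter = 500))

# Top 3 terms per topic and the per-document topic proportions
topTerms <- terms(topicModel, 3)
docTopics <- posterior(topicModel)$topics
print(topTerms)
print(round(docTopics, 2))
```

On a real corpus, `as.matrix(DTM)` can be too large to hold in memory; `slam::row_sums(DTM)` computes the same row totals on the sparse matrix.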