R coding



Assignment 3

0. Use Rmarkdown to do the following tasks (2). Please note that the presentation of the document and the range of Rmarkdown features/functions used matter.

1. Explain the topic model. Give a real-world example application of the topic model. (4)

2. Download the Twitter dataset (rdmTweets-201306.RData) from the course website and do the following. (6)
• Text cleaning: remove URLs, convert to lower case, and remove all characters other than English letters and spaces.
• Count the frequency of the words “data” and “mining”.
• Plot the word cloud.
• Use a topic modelling algorithm to fit the Twitter data to 8 topics. Find the top 6 most frequent terms (words) in each topic.

3. What is stream data? Give a real-world example of a stream-data system/application. Explain the challenges of algorithms for stream data analysis. (3)

4. Select a data mining algorithm that is applicable to stream data and explain it in detail. (3)

5. Create a data stream of two-dimensional data points. The data points will follow a Gaussian distribution with 5% noise and belong to 4 clusters. Compare the performance of the following clustering methods in terms of precision, recall, and F1. (4)
• Use Reservoir sampling to sample 200 data points from 500 data points of the stream. Use K-means to cluster the points in the reservoir into 5 groups, and use 100 points from the stream to evaluate the performance of K-means.
• Use the Windowing method to get 200 data points from 500 data points of the stream. Use K-means to cluster the points in the window into 5 groups, and use 100 points from the stream to evaluate the performance of K-means.
• Apply the D-Stream clustering method to 500 points from the stream with gridsize=0.1, and use 100 points from the stream to evaluate the performance of D-Stream.

6. What is geographical data analysis? Explain a real-world application of a geographical information system. (4)

7. Using spatial data analysis packages in R, including sp, maps, and ggmap, do the following tasks. (4)
• Draw a map of Australia where each city is represented as a dot. Highlight cities with a population of more than one million people.
• Use ggmap to draw the Google map of Australia. Add markers for big cities with more than one million people. The bigger the city, the larger the size of the marker.
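The reservoir sampling step in task 5 can be illustrated with a small sketch. The assignment itself asks for R; the Python below only shows the technique (the classic single-pass scheme often called Algorithm R) and is not a solution to the task. The names gaussian_stream and reservoir_sample, the cluster centres, the standard deviation of 0.05, and the choice of uniform noise on the unit square are all assumptions made for illustration.

```python
import random


def gaussian_stream(n, centers, sd, noise, rng):
    """Yield n 2-D points: Gaussian around a random centre, or uniform noise.

    Hypothetical stream generator; the noise model (uniform on the unit
    square) is an assumption, not something the assignment specifies.
    """
    for _ in range(n):
        if rng.random() < noise:
            yield (rng.random(), rng.random())
        else:
            cx, cy = rng.choice(centers)
            yield (rng.gauss(cx, sd), rng.gauss(cy, sd))


def reservoir_sample(stream, k, rng):
    """Keep a uniform random sample of k items from a stream of unknown length."""
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i (0-based) replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


rng = random.Random(42)
centers = [(0.2, 0.2), (0.2, 0.8), (0.8, 0.2), (0.8, 0.8)]  # assumed centres
stream = gaussian_stream(500, centers, sd=0.05, noise=0.05, rng=rng)
sample = reservoir_sample(stream, 200, rng)
print(len(sample))  # 200
```

Unlike a sliding window, which keeps only the most recent 200 points, the reservoir gives every one of the 500 points the same chance of being retained, which is the contrast the two K-means bullets in task 5 set up.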
Oct 27, 2022