This assignment is on Apache Spark using Databricks. Please see attached.


Assignment: Spark

Learning Outcomes
In this assignment, you will do the following:
· Import Dataset to Spark Databricks environment
· Create tables for data imported
· Perform basic data analysis using transformations and Spark SQL

Your assignment must be submitted in the following file types:
· Databricks notebook file (DBC), and
· Portable Document Format (PDF) OR Microsoft Word (DOC or DOCX)

1. Import accompanying notebook - https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/3612601988648515/2314043357335221/6996201554190232/latest.html (Note: how to import notebook to Databricks: https://docs.databricks.com/user-guide/notebooks/index.html)
2. Import Data: The easiest way is to download the file from GitHub to your computer: https://github.com/dmatrix/examples/blob/master/spark/databricks/notebooks/py/data/iot_devices.json and then import it to Databricks (Note: How to import file: https://docs.databricks.com/user-guide/tables.html#create-table-ui)
3. Run it. (Note: don't forget to create a cluster and attach the imported notebook to it (upper left corner: button `detached`) before trying to run it.)

** Questions ** (12 marks)
1. Answer following questions:
1.1 How many sensor pads are reported to be from Poland (2 marks)
1.2 How many different lcds are present in the dataset (2 marks)
1.3 Find 5 countries that have the largest number of MAC devices used (2 marks)
1.4 Introduce a new interesting analytics / algorithm you could deduce from this dataset (2 marks)
** Bonus: 4 marks for using MLLib in 2.4 - https://spark.apache.org/docs/latest/ml-guide.html.
Answered Same Day, Feb 25, 2021


Ximi answered on Feb 28 2021
solutiondatabricks
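The answers below presuppose that the JSON file has been loaded and exposed as a table. A minimal sketch of that step follows. The DBFS path and the view name `iot_devices` are assumptions chosen for illustration (the actual upload location depends on how the file was imported), and `spark` stands for the SparkSession that Databricks notebooks predefine.

```python
# Assumption: the uploaded file landed at this DBFS path; adjust to your upload location.
DATA_PATH = "/FileStore/tables/iot_devices.json"
# Hypothetical view name, used only for illustration.
TABLE_NAME = "iot_devices"


def register_iot_table(spark, path=DATA_PATH, name=TABLE_NAME):
    """Read the JSON file into a DataFrame and expose it to Spark SQL.

    `spark` is a SparkSession; Databricks notebooks predefine one under that name.
    """
    df = spark.read.json(path)        # schema is inferred from the JSON records
    df.createOrReplaceTempView(name)  # makes the data queryable via spark.sql(...)
    return df
```

In a notebook cell this would be called as `df = register_iot_table(spark)`, after which `df.printSchema()` shows the inferred columns.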
1.1 How many sensor pads are reported to be from Poland
Answer - 1413
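One way such a count could be obtained with Spark SQL is sketched below. It is not necessarily the query used for the answer above. It assumes the data is registered as a table named `iot_devices`, that the country name is in a column `cn`, and that sensor pads are identified by a `device_name` beginning with `sensor-pad`; all three should be checked against the actual schema.

```python
# Assumptions: table `iot_devices`; columns `cn` (country name) and `device_name`;
# sensor pads are the rows whose device_name starts with "sensor-pad".
SENSOR_PADS_IN_POLAND_SQL = """
SELECT COUNT(*) AS sensor_pads
FROM iot_devices
WHERE cn = 'Poland'
  AND device_name LIKE 'sensor-pad%'
"""


def count_sensor_pads_in_poland(spark):
    """Run the query on a SparkSession and return the count as a plain number."""
    return spark.sql(SENSOR_PADS_IN_POLAND_SQL).first()[0]
```

In a Databricks notebook, `count_sensor_pads_in_poland(spark)` would return the number directly; the same query can also be run in a `%sql` cell.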
1.2 How many different lcds are present in the dataset
Answer - 3 (green, yellow, red). Counts per lcd value:
+------+-----+
|   lcd|count|
+------+-----+
| green|49699|
|yellow|99051|
|   red|49414|
+------+-----+
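A table like the one above could come from a group-and-count over the `lcd` column. The sketch below shows that query alongside one that returns the number of distinct values directly. The table name `iot_devices` is an assumption; the column name `lcd` is taken from the output above.

```python
# Assumption: the data is registered as a table named `iot_devices`.
LCD_COUNTS_SQL = """
SELECT lcd, COUNT(*) AS count
FROM iot_devices
GROUP BY lcd
"""

DISTINCT_LCD_SQL = "SELECT COUNT(DISTINCT lcd) AS n_lcd FROM iot_devices"


def lcd_summary(spark):
    """Return (number of distinct lcd values, {lcd value: row count})."""
    n_distinct = spark.sql(DISTINCT_LCD_SQL).first()[0]
    rows = spark.sql(LCD_COUNTS_SQL).collect()
    per_value = {row["lcd"]: row["count"] for row in rows}
    return n_distinct, per_value
```

To print the counts in tabular form in a notebook, `spark.sql(LCD_COUNTS_SQL).show()` can be used instead of collecting the rows.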
1.3 Find 5...
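For question 1.3, a possible Spark SQL formulation is sketched here. It assumes a table named `iot_devices`, a country column `cn`, and that "MAC devices" means rows whose `device_name` starts with `device-mac`; that reading of the question is an assumption, not something the dataset documentation confirms.

```python
# Assumptions: table `iot_devices`; column `cn` holds the country name;
# MAC devices are rows whose device_name starts with "device-mac".
TOP_MAC_COUNTRIES_SQL = """
SELECT cn, COUNT(*) AS mac_devices
FROM iot_devices
WHERE device_name LIKE 'device-mac%'
GROUP BY cn
ORDER BY mac_devices DESC
LIMIT 5
"""


def top_mac_countries(spark):
    """Return a list of (country, device count) pairs, largest count first."""
    rows = spark.sql(TOP_MAC_COUNTRIES_SQL).collect()
    return [(row["cn"], row["mac_devices"]) for row in rows]
```

Countries with equal counts may appear in either order, since the query sorts only on the count.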