INSY XXXXXXXXXX Assignment 4 (6 pts)
1. HDP Sandbox
2. VMware or VirtualBox
3. Sample data “retail-store-logs-sample-data1.zip”
4. Use the reference documentation below to complete the assignment.
1. Log on to the Ambari page. [**Ref]
2. Open Data Analytics Studio.
3. Create the first table, products, by uploading product.tsv.
4. Create the second table, users, by uploading users.tsv.
5. Create the third table, omniturelogs, via a query for omniture-logs.tsv.
6. Load the data, then execute a query to import it into the third table.
7. Save a query that creates the table omniture, a refined subset of the data with only a
handful of fields (ts, ip, url, swid, city, country, state). [**Ref]
8. Run the saved query from step 7. [**Ref]
9. Join data from multiple tables: write a query that creates the table webloganalytics
from the omniture table and the users table. [**Ref]
10. Run a query to view the data in webloganalytics, then download the table as a .csv file
to your local machine.
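
For orientation only, the logic behind steps 5 through 10 can be sketched in Python, with an in-memory SQLite database standing in for Hive. This is not the assignment solution: the actual work is done with Hive queries in Data Analytics Studio, following the reference documentation. The trimmed table schemas, the sample rows, the gender column, and the use of swid as the join key are all assumptions made for illustration.

```python
import csv
import io
import sqlite3

# In-memory SQLite stands in for Hive; the assignment itself runs in DAS.
conn = sqlite3.connect(":memory:")

# Hypothetical, heavily trimmed schemas (the real log table has more columns).
conn.execute(
    "CREATE TABLE omniturelogs (ts TEXT, ip TEXT, url TEXT, swid TEXT, "
    "city TEXT, country TEXT, state TEXT, extra TEXT)"
)
conn.execute("CREATE TABLE users (swid TEXT, gender TEXT)")

# Steps 5-6 analogue: parse tab-separated text and load it into the raw table.
# The rows below are made up for illustration.
tsv_text = (
    "2012-03-15 01:34:46\t10.0.0.1\t/shoes\tA1\tdallas\tusa\ttx\tx\n"
    "2012-03-15 02:10:05\t10.0.0.2\t/hats\tB2\taustin\tusa\ttx\ty\n"
)
log_rows = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
conn.executemany("INSERT INTO omniturelogs VALUES (?, ?, ?, ?, ?, ?, ?, ?)", log_rows)
conn.execute("INSERT INTO users VALUES ('A1', 'F')")

# Steps 7-8 analogue: keep only the handful of fields named in the assignment.
conn.execute(
    "CREATE TABLE omniture AS "
    "SELECT ts, ip, url, swid, city, country, state FROM omniturelogs"
)

# Step 9 analogue: join the refined table to users (join key is an assumption).
conn.execute(
    "CREATE TABLE webloganalytics AS "
    "SELECT o.ts, o.ip, o.url, o.city, o.country, o.state, u.gender "
    "FROM omniture o LEFT JOIN users u ON o.swid = u.swid"
)

# Step 10 analogue: view up to 10 rows and serialize them as CSV text.
cursor = conn.execute("SELECT * FROM webloganalytics ORDER BY ts LIMIT 10")
header = [col[0] for col in cursor.description]
rows = cursor.fetchall()

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(header)
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

The left join keeps log rows that have no matching user, so the second sample row appears with an empty gender field. Whether an inner or outer join is appropriate in the real tables depends on the reference documentation.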
1. *Run a query to show data in the products table, limit 10.
2. *Run a query to show data in the users table, limit 10.
3. *Run a query to show data in the omniturelogs table, limit 10.
4. *Run the saved query from step 7 above, limit 10.
5. *Run a query to show data in the webloganalytics table, limit 10.
6. *Screenshot of the downloaded .csv file.
1. Each screenshot must include your name and student ID (typed in a text editor), as shown below:
* Screenshot required