Option #1: Clustering using CARS dataset



In this assignment, assume that you work in the automobile industry and examine different makes and models of automobiles. Your company is particularly interested in clustering the variables so that it can determine whether the data are suitable for use in the next phase of its upcoming analytics project. Complete the following tasks:



  1. Locate the data under Libraries > My Libraries > SASHELP > CARS.

  2. Using correlations and scatterplots, examine the linear relationships among the quantitative variables: MSRP, Invoice, EngineSize, Cylinder, Horsepower, MPG_City, MPG_Highway, Weight, Wheelbase, and Length. Comment on the relationships.

  3. Provide the summary statistics for all quantitative variables using Origin as a classification variable. Comment on the summary statistics. What are some of the key characteristics of the data?

  4. Provide a graphical summary (such as a histogram) for all quantitative variables using Origin as a classification variable. Comment on the graphs.

  5. Using the quantitative variables from the whole dataset, conduct a cluster analysis using PROC FASTCLUS, PROC VARCLUS, or both. Make sure to interpret the SAS output.

  6. Repeat part 5 using the variable Origin as a classification (group) variable.
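To make the idea of a classification variable in tasks 3, 4, and 6 concrete, here is a minimal Python sketch (not SAS, and not the assignment's required tool) that computes per-group summary statistics. The rows are hypothetical toy values, not taken from SASHELP.CARS, and the helper name `summarize_by_group` is an assumption made for illustration.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical toy rows of (origin, horsepower); NOT values from SASHELP.CARS.
rows = [
    ("Asia", 140), ("Asia", 160), ("Asia", 180),
    ("Europe", 200), ("Europe", 240),
    ("USA", 150), ("USA", 210), ("USA", 300),
]

def summarize_by_group(rows):
    """Return {group: {"n", "mean", "std", "min", "max"}} for (group, value) rows."""
    groups = defaultdict(list)
    for group, value in rows:
        groups[group].append(value)
    return {
        g: {"n": len(v), "mean": mean(v), "std": stdev(v),
            "min": min(v), "max": max(v)}
        for g, v in groups.items()
    }

summary = summarize_by_group(rows)
for origin in sorted(summary):
    print(origin, summary[origin])
```

Each level of the classification variable gets its own row of statistics, which is the same layout one would expect when summarizing by Origin.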


Note: PROC FASTCLUS is based on the k-means procedure. More information, including a description, syntax, and examples, can be found on the SAS FASTCLUS documentation page.
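For readers unfamiliar with k-means, the following is a minimal Python sketch of the textbook assign-and-update loop. It is an illustration of the general idea only: PROC FASTCLUS has its own seed selection and options, which are described in the SAS documentation. The function name `kmeans`, the toy points, and the choice of seeds are all assumptions made for this example.

```python
import math

def kmeans(points, seeds, max_iter=100):
    """Textbook k-means on a list of equal-length tuples.

    `seeds` are the initial cluster centers. Returns (centers, labels).
    """
    centers = [tuple(s) for s in seeds]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the loop has converged
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous center
                centers[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, labels

# Hypothetical toy points forming two well-separated groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centers, labels = kmeans(points, seeds=[points[0], points[3]])
print(labels)
print(centers)
```

Because k-means relies on Euclidean distance, variables measured on very different scales (for example price versus engine size) are commonly standardized before clustering.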


PROC VARCLUS is based on linear combinations of the variables in the cluster. More information, including a description, syntax, and examples, can be found on the SAS VARCLUS documentation page.


For each part, take screenshots of the relevant SAS output and paste them into a Word document. Include all relevant calculations and your answers to all assignment items, and submit the document to Canvas for grading. Clearly label all elements in your submission. In addition, provide a short description of any challenge(s) you faced during this assignment.


Your submission should be three to four pages in length and conform to the CSU Global Writing Center guidelines. Review the grading rubric to see how you will be graded for this assignment.

Answered Same Day Jan 17, 2021


Aarti answered on Jan 18 2021
2. CORRELATION AND SCATTERPLOTS: Invoice was correlated with each of the other numeric variables, and the results are as follows:
Invoice is strongly and positively correlated with MSRP: the data points lie close together along a line that rises from left to right. In contrast, MPG_City and MPG_Highway show a weak, negative relationship with Invoice.
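As a rough illustration of what a strong positive versus a negative correlation means numerically, here is a minimal Python sketch of the Pearson correlation coefficient. The numbers are hypothetical toy values, not the SASHELP.CARS data or the answer's SAS output, and `pearson_r` is a helper name assumed for this example.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical toy values; NOT taken from SASHELP.CARS.
invoice = [18000, 22000, 27000, 35000, 50000]
msrp = [20000, 24000, 30000, 38000, 55000]
mpg_city = [32, 27, 29, 21, 20]

r_price = pearson_r(invoice, msrp)     # close to +1: points hug a rising line
r_mpg = pearson_r(invoice, mpg_city)   # below 0: MPG falls as invoice rises
print(round(r_price, 3), round(r_mpg, 3))
```

A coefficient near +1 corresponds to the tight, rising scatterplot pattern, while a negative coefficient corresponds to points that trend downward from left to right.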
3. SUMMARY STATISTICS: The summary statistics of the data, with Origin as the classification variable, are attached below. They yield the following insights:
· The highest...