Question

Learning objectives:

The key frameworks and concepts covered in modules 1–6 are particularly relevant for this assignment. Assignment 2 relates to the course learning objectives 1, 2 and 4:

1. demonstrate applied knowledge of people, markets, finances, technology and management in a global context of business intelligence practice (data warehouse design, data mining process, data visualisation and performance management) and resulting organisational change and how these apply to implementation of business intelligence in organisation systems and business processes
2. identify and solve complex organisational problems creatively and practically through the use of business intelligence and critically reflect on how evidence based decision making and sustainable business performance management can effectively address real world problems
4. demonstrate the ability to communicate effectively in a clear and concise manner in written report style for senior management with correct and appropriate acknowledgment of main ideas presented and discussed.

Assignment 2 consists of three main tasks and a number of sub tasks.

Task 1 Data Management Architectures (Worth 30 Marks)

For Task 1 you are required to conduct a critical literature review of the related concepts of data warehouses, data lakes and data marts in order to complete two sub tasks.

Task 1.1) Define and discuss the concepts of (1) a data warehouse, (2) a data lake and (3) a data mart, drawing on relevant and reputable literature (about 500 words).

Task 1.2) Explain how a data warehouse, a data lake and a data mart would be used in an organization, and in your answer provide some real world examples of how each would be used in an organization (about 500 words).

Task 2 Exploratory Data Analysis and Linear Regression Analysis (Worth 40 Marks)

For Task 2 consider a set of observations on a large number of white wine varieties involving their chemical properties and ranking by wine tasters contained in the white-wines.csv data set.
Wine industry has been growing steadily as social drinking of wine is on the rise. The price of a wine largely depends on wine appreciation by wine tasters which may have a high degree of variability. Another key factor in wine certification and quality assessment is physicochemical tests which are laboratory-based and take into account factors like acidity, pH level, presence of sugar and other chemical properties.

For wine producers, it would be of interest if wine tasters’ perception of wine quality after tasting can be related to the chemical properties of wine so that certification and quality assessment and assurance process of wines is more rigorous.

The white-wines.csv data set consists of 4898 white wine varieties in total (records). All wines are from one wine producing region. The white-wines.csv data set was collected on 12 different properties of wines. Quality is based on sensory data (wine tasters’ perception of the quality of a wine), the rest are based on chemical properties of wines including density, acidity, alcohol content etc. All chemical properties of wines are coded as continuous numeric variables. Quality is an ordinal variable with a possible ranking from 1 (worst) to 10 (best).
Each white wine variety is tasted by three independent tasters and the final rank assigned is the median rank given by the tasters. See Table 1 White Wines Data Set Data Dictionary for full details of the white-wines.csv data set.

Discuss the key results of your exploratory data analysis presented in Table 2.1 and provide a rationale for why you have selected your five top variables for predicting a wine taster’s ranking of a white wine, drawing on the results of your EDA analysis and relevant literature (about 500 words).

Task 2.2) Build a Linear Regression model for predicting the quality ranking of a white wine using a RapidMiner data mining process and an appropriate set of data mining operators and a reduced set of variables from the white-wines.csv data set determined by your exploratory data analysis in Task 2.1. Provide these outputs from RapidMiner for Task 2.2 for the white-wines.csv data set: (1) Final Linear Regression Model process and (2) Summary Table of Results of Final Linear Regression Model.

Briefly describe your final Linear Regression Model process, and discuss the results of the Final Linear Regression Model for the white-wines.csv data set, drawing on the key outputs (coefficients, standardised coefficients, t-statistic values, p-values and significance levels etc) for predicting Wine Quality and relevant supporting literature on the interpretation of a Linear Regression Model (about 500 words).

Include all appropriate RapidMiner outputs such as RapidMiner Processes, Graphs and Tables that support the key aspects of your exploratory data analysis and linear regression model analysis of the white-wines.csv data set in your Assignment 2 report. Note that you need to export these outputs from RapidMiner using the File/Print/Export Image option and include them in Task 2 where relevant or in Appendix A of the Assignment 2 report.
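The brief requires RapidMiner for Task 2, so the following is not the expected deliverable. It is only a minimal Python sketch of what the requested outputs mean: ranking candidate variables by their correlation with quality, then fitting a linear regression and reporting coefficients, standardised coefficients, t-statistics and p-values. Everything in it is an assumption made for illustration: the data are synthetic (not values from white-wines.csv), the column names alcohol, volatile_acidity and ph are hypothetical, only two predictors are kept instead of five, and the p-values use a normal approximation to the t-distribution, which is reasonable only for large samples.

```python
import math
import random
import statistics


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ols(columns, y):
    """Ordinary least squares fit of y on the given columns plus an intercept.

    Returns (coefficients, standard errors, t-statistics), intercept first.
    """
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k)]
    xtx_inv = invert(xtx)
    beta = [sum(xtx_inv[a][b] * xty[b] for b in range(k)) for a in range(k)]
    residuals = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]
    sigma2 = sum(e * e for e in residuals) / (n - k)  # residual variance
    se = [math.sqrt(sigma2 * xtx_inv[a][a]) for a in range(k)]
    t = [b / s for b, s in zip(beta, se)]
    return beta, se, t


# Synthetic stand-in data; hypothetical column names, NOT white-wines.csv values.
random.seed(42)
n = 500
data = {
    "alcohol": [random.gauss(10.5, 1.2) for _ in range(n)],
    "volatile_acidity": [random.gauss(0.28, 0.10) for _ in range(n)],
    "ph": [random.gauss(3.2, 0.15) for _ in range(n)],
}
# Made-up relationship, used only to generate a toy quality score.
quality = [3.0 + 0.3 * a - 2.0 * v + random.gauss(0, 0.5)
           for a, v in zip(data["alcohol"], data["volatile_acidity"])]

# EDA step: rank variables by absolute correlation with quality, keep the top two.
ranking = sorted(data, key=lambda name: abs(pearson(data[name], quality)), reverse=True)
top = ranking[:2]

# Regression step: fit the reduced model and derive the summary statistics.
beta, se, t = ols([data[name] for name in top], quality)
std_beta = [b * statistics.stdev(data[name]) / statistics.stdev(quality)
            for b, name in zip(beta[1:], top)]
# Two-sided p-values via a normal approximation to the t-distribution.
p = [2 * (1 - statistics.NormalDist().cdf(abs(v))) for v in t]

for name, b, s, tv, pv in zip(["intercept"] + top, beta, se, t, p):
    print(f"{name:>18} coef={b:8.3f} se={s:7.3f} t={tv:8.2f} p~{pv:.4f}")
print("standardised coefficients:", dict(zip(top, std_beta)))
```

Read against the brief, a larger absolute t-statistic (and smaller p-value) indicates stronger evidence that a variable contributes to predicted quality, while the standardised coefficients allow the effects of variables measured on different scales to be compared.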
