The goal is to model wine quality based on physicochemical tests.
The dataset contains the following attributes.
Attribute Information:
Input variables (based on physicochemical tests):
1 - fixed acidity
2 - volatile acidity
3 - citric acid
4 - residual sugar
5 - chlorides
6 - free sulfur dioxide
7 - total sulfur dioxide
8 - density
9 - pH
10 - sulphates
11 - alcohol
Output variable (based on sensory data):
12 - quality (discrete score between 0 and 8)
Your goal is to:
1. Build a decision tree model (using the Gini index) that uses the physicochemical tests to predict wine quality.
2. Perform 10-fold cross-validation and present precision, recall, accuracy, and F1 score.
3. Build a linear regression model that uses the physicochemical tests to predict wine quality.
4. Perform a 70-30 holdout evaluation and present the Mean Squared Error of the developed model.
(3+3+3+3)
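As a rough illustration of the quantities behind tasks 1 and 2, the sketch below computes the Gini impurity of a set of labels, produces index splits for k-fold cross-validation, and derives accuracy with macro-averaged precision, recall, and F1. The function names are hypothetical, macro averaging is an assumption (the assignment does not say which averaging to use), and the labels in the usage lines are made up, not taken from the wine dataset. A library such as scikit-learn would normally handle all of this; the code is only meant to show what the numbers mean.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def split_gini(left, right):
    """Size-weighted Gini impurity of a candidate split into two groups."""
    n = len(left) + len(right)
    return (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)


def kfold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) for k contiguous folds.

    Assumes the rows were shuffled beforehand.
    """
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def macro_scores(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1."""
    classes = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
    m = len(classes)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / m,
        "recall": sum(recalls) / m,
        "f1": sum(f1s) / m,
    }


# Made-up quality labels, for illustration only.
scores = macro_scores([5, 5, 6, 6], [5, 6, 6, 6])
folds = list(kfold_indices(25, k=10))
```

In a full solution, each fold's predictions from the decision tree would be passed to a scoring function like `macro_scores`, and the per-fold results averaged.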
Submit a folder that contains your code and results.
a. Code
b. Results (question 1: decision tree diagram; question 2: performance measures; question 3: linear regression coefficients; question 4: MSE score)
The data is attached.
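For the 70-30 holdout and Mean Squared Error asked for in questions 3 and 4, the following is a minimal sketch of the mechanics. It is deliberately simplified: it fits a single-feature least-squares line, whereas the assignment calls for a regression on all eleven inputs. The helper names are hypothetical and the data is synthetic (points on the line y = 2x + 1), not wine measurements.

```python
import random


def holdout_split(rows, train_fraction=0.7, seed=0):
    """Shuffle rows reproducibly and split them into train and test parts."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def fit_simple_linear(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_squared_error(y_true, y_pred):
    """Average of squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Synthetic (x, y) pairs on the line y = 2x + 1; not real wine data.
rows = [(float(x), 2.0 * x + 1.0) for x in range(10)]
train, test = holdout_split(rows)

# Fit on the 70% training part only.
slope, intercept = fit_simple_linear([x for x, _ in train], [y for _, y in train])

# Evaluate on the held-out 30%.
predictions = [slope * x + intercept for x, _ in test]
mse = mean_squared_error([y for _, y in test], predictions)
```

The key point is that the coefficients are estimated from the training rows alone and the MSE is computed only on rows the model has not seen.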