You are to submit either a GitHub link (for a static visualization) or a shinyapps.io link (for an interactive visualization) to your final project (remember to include the data in your repo so I can run your code). In addition, you are to submit a link to the GitHub repository with your code (two links total).


The total number of points assigned to your final project submission is 20, distributed as follows:


2 points for a working link to GitHub (for a static visualization) or shinyapps.io (for an interactive visualization)
2 points for a description of the data and where the data was acquired
6 points, divided into 2 points for each of three plots (for a total of three different types of plot, from the different types we've seen in class)
2 points for the use of colorblind-friendly color schemes
2 points for the use of the appropriate color scheme (categorical, divergent, or continuous) given the variable mapped to the color/fill aesthetics
2 points for appropriate axes scales and labels (meaning they are legible, not overlapping, and clearly state what is being displayed in the plot)
2 points for titles and captions that make it clear what the plot is about
2 points for appropriate ordering of group levels (examples: unordered categorical variables are displayed not according to alphabetical order, but reordered by the numeric variable used; ordered categorical variables are shown in their correct order)
Again, what you need to submit is a LINK to your visualization and a LINK to your GitHub repo with your code.
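
As an illustration of the rubric items on level ordering and colorblind-friendly color, here is a minimal Python sketch. It assumes the data is available as a list of records; the function name `order_levels_by_median`, the toy records, and the choice of the Okabe-Ito palette are illustrative assumptions, not part of the assignment.

```python
from statistics import median

# Okabe-Ito colors, a palette commonly recommended as colorblind-friendly.
OKABE_ITO = ["#E69F00", "#56B4E9", "#009E73", "#F0E442",
             "#0072B2", "#D55E00", "#CC79A7"]


def order_levels_by_median(rows, group_key, value_key):
    """Return group labels sorted by the median of value_key, smallest first."""
    values_by_group = {}
    for row in rows:
        values_by_group.setdefault(row[group_key], []).append(row[value_key])
    return sorted(values_by_group, key=lambda g: median(values_by_group[g]))


# Hypothetical toy records (not taken from any real dataset).
rows = [
    {"group": "beta", "value": 5.0},
    {"group": "alpha", "value": 9.0},
    {"group": "gamma", "value": 1.0},
    {"group": "beta", "value": 7.0},
    {"group": "alpha", "value": 11.0},
    {"group": "gamma", "value": 3.0},
]

# Levels ordered by the numeric variable instead of alphabetically.
levels = order_levels_by_median(rows, "group", "value")

# One color per level, assigned in display order.
colors = dict(zip(levels, OKABE_ITO))
print(levels)
```

The resulting list could then be handed to a plotting call, for example through the `order` parameter of seaborn's `boxplot`, with `colors` supplied as the `palette`.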
Nithin answered on Oct 15 2021
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "DataVizualisation.Ipynb",
"provenance": []
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"language_info": {
"name": "python"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "qS8EH-GntbnQ"
},
"source": [
"## Data Visualuization Assignment\n",
"\n",
"#### Name : \n",
"\n",
"#### Link : https://colab.research.google.com/drive/1IzkpnxS6bZY7hKeNQ-x0dHK6rrSow0OQ?usp=sharing"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "FwOXPYrHtRj7"
},
"source": [
"## Importing necessary modules"
]
},
{
"cell_type": "code",
"metadata": {
"id": "-qjpjHQKVn_7"
},
"source": [
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"import seaborn as sns"
],
"execution_count": 1,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "SwtEp_FTYkTd"
},
"source": [
"## Loading Dataset"
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 424
},
"id": "Dl3QUnheYDw3",
"outputId": "db0e2e6c-86b9-4a51-c220-61954cd06fbb"
},
"source": [
"dataset = pd.read_csv('Iris.csv')\n",
"dataset"
],
"execution_count": 3,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"
\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"
IdSepalLengthCmSepalWidthCmPetalLengthCmPetalWidthCmSpecies
015.13.51.40.2Iris-setosa
124.93.01.40.2Iris-setosa
234.73.21.30.2Iris-setosa
344.63.11.50.2Iris-setosa
455.03.61.40.2Iris-setosa
.....................
1451466.73.05.22.3Iris-virginica
1461476.32.55.01.9Iris-virginica
1471486.53.05.22.0Iris-virginica
1481496.23.45.42.3Iris-virginica
1491505.93.05.11.8Iris-virginica
\n",
"

150 rows × 6 columns

\n",
"
"
],
"text/plain": [
" Id SepalLengthCm ... PetalWidthCm Species\n",
"0 1 5.1 ... 0.2 Iris-setosa\n",
"1 2 4.9 ... 0.2 Iris-setosa\n",
"2 3 4.7 ... 0.2 Iris-setosa\n",
"3 4 4.6 ... 0.2 Iris-setosa\n",
"4 5 5.0 ... 0.2 Iris-setosa\n",
".. ... ... ... ... ...\n",
"145 146 6.7 ... 2.3 Iris-virginica\n",
"146 147 6.3 ... 1.9 Iris-virginica\n",
"147 148 6.5 ... 2.0 Iris-virginica\n",
"148 149 6.2 ... 2.3 Iris-virginica\n",
"149 150 5.9 ... 1.8 Iris-virginica\n",
"\n",
"[150 rows x 6 columns]"
]
},
"metadata": {},
"execution_count": 3
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "sUpDO28FYvYl"
},
"source": [
"## Describing Data"
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 301
},
"id": "qmsq9LS1Yfv9",
"outputId": "3e99439c-6e98-420c-8d36-cfb47c78fe88"
},
"source": [
"dataset.describe()"
],
"execution_count": 5,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"
\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"
SepalLengthCmSepalWidthCmPetalLengthCmPetalWidthCm
count150.000000150.000000150.000000150.000000
mean5.8433333.0540003.7586671.198667
std0.8280660.4335941.7644200.763161
min4.3000002.0000001.0000000.100000
25%5.1000002.8000001.6000000.300000
50%5.8000003.0000004.3500001.300000
75%6.4000003.3000005.1000001.800000
max7.9000004.4000006.9000002.500000
\n",
"
"
],
"text/plain": [
" SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm\n",
"count 150.000000 150.000000 150.000000 150.000000\n",
"mean 5.843333 3.054000 3.758667 1.198667\n",
"std 0.828066 0.433594 1.764420 0.763161\n",
"min 4.300000 2.000000 1.000000 0.100000\n",
"25% 5.100000 2.800000 1.600000 0.300000\n",
"50% 5.800000 3.000000 4.350000 1.300000\n",
"75% 6.400000 3.300000 5.100000 1.800000\n",
"max 7.900000 4.400000 6.900000 2.500000"
]
},
"metadata": {},
"execution_count": 5
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "XSau8UHZak8t"
},
"source": [
" The data set contains 3 species of 50 instances each, where each class refers to a type of iris plant. One class is linearly separable from the other 2; the latter are NOT linearly separable from each other. "
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 206
},
"id": "XS0nfZ0ma2zX",
"outputId": "e9e6e0bf-9474-4d0e-85bb-8d9b0caed240"
},
"source": [
"dataset = dataset.drop(columns=['Id'])\n",
"dataset.head()"
],
"execution_count": 6,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"
\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"
SepalLengthCmSepalWidthCmPetalLengthCmPetalWidthCmSpecies
05.13.51.40.2Iris-setosa
14.93.01.40.2Iris-setosa
24.73.21.30.2Iris-setosa
34.63.11.50.2Iris-setosa
45.03.61.40.2Iris-setosa
\n",
"
"
],
"text/plain": [
" SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species\n",
"0 5.1 3.5 1.4 0.2 Iris-setosa\n",
"1 4.9 3.0 1.4 0.2 Iris-setosa\n",
"2 4.7 3.2 1.3 0.2 Iris-setosa\n",
"3 4.6 3.1 1.5 0.2 Iris-setosa\n",
"4 5.0 3.6 1.4 0.2 Iris-setosa"
]
},
"metadata": {},
"execution_count": 6
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "b--Y4sLYZMqZ"
},
"source": [
"The columns in this dataset are:\n",
"\n",
" Id\n",
" SepalLengthCm\n",
" SepalWidthCm\n",
" PetalLengthCm\n",
" PetalWidthCm\n",
" Species\n",
"\n",
"Column ID is removed as it doesn't serve any important function"
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "x-VZvE7bb8YQ",
"outputId": "06c5e3ec-eb80-4ce9-9c29-25bb4c5b75b9"
},
"source": [
"dataset.info()"
],
"execution_count": 7,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"\n",
"RangeIndex: 150 entries, 0 to 149\n",
"Data columns (total 5 columns):\n",
" # Column Non-Null Count Dtype \n",
"--- ------ -------------- ----- \n",
" 0 SepalLengthCm 150 non-null float64\n",
" 1 SepalWidthCm 150 non-null float64\n",
" 2 PetalLengthCm 150 non-null float64\n",
" 3 PetalWidthCm 150 non-null float64\n",
" 4 Species 150 non-null object \n",
"dtypes: float64(4), object(1)\n",
"memory usage: 6.0+ KB\n"
]
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "tfv7vGAqbq8f"
},
"source": [
"## Data Source :\n",
"\n",
"Creator:\n",
"\n",
"R.A. Fisher\n",
"\n",
"Donor:\n",
"\n",
"Michael Marshall \n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Fnc_005BaNbn"
},
"source": [
"Dataset can be found at : https://www.kaggle.com/uciml/iris"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "UT4bpKW9coSL"
},
"source": [
"## Data Vizualisation"
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 299
},
"id": "b5C4BNmVayKg",
"outputId": "e65267ae-4715-43e3-e6c6-564744cefab0"
},
"source": [
"## 1. Scatterplot\n",
"\n",
"## Colorblind friendly colours\n",
"color = ['cornflowerblue','lightseagreen','steelblue']\n",
"species=['Iris-setosa','Iris-virginica','Iris-versicolor']\n",
"\n",
"for i in range(3):\n",
" x=dataset[dataset['Species']==species[i]]\n",
" plt.scatter(x['SepalLengthCm'],x['SepalWidthCm'],c=color[i],label=species[i])\n",
"plt.xlabel('SepalLength')\n",
"plt.ylabel('SepalWidth')\n",
"plt.legend()"
],
"execution_count": 22,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
""
]
},
"metadata": {},
"execution_count": 22
},
{
"output_type": "display_data",
"data": {
"image/png": "\n",
"text/plain": [
"
"
]
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "EilXdyVlkPtH"
},
"source": [
"##### Scatterplot : A scatter plot or graph uses dots to represent values for two different numeric variables to find the relationship between them. Here the relationship between SeapalLength and SepalWidth are shown with repsect to each class of the flower."
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 296
},
"id": "QPX6Yfcqfr1c",
"outputId": "383bd222-6289-4c7f-f63a-3a03bc7b81dc"
},
"source": [
"## 2. Box Plot\n",
"\n",
"## Colourblind friendly pallete : cubehelix\n",
"\n",
"sns.boxplot(x=\"Species\", y=\"PetalLengthCm\", palette=\"cubehelix\", data=dataset)"
],
"execution_count": 30,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
""
]
},
"metadata": {},
"execution_count": 30
},
{
"output_type": "display_data",
"data": {
"image/png": "\n",
"text/plain": [
"
"
]
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "aaQ4r5m_l0Bm"
},
"source": [
"##### Box Plot : A boxplot is a graph that gives a good indication of how the values in the data are spread out. They consume less space as compared to Histograms and are also useful for identifying outliers."
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 885
},
"id": "f4zBwaHXh5lM",
"outputId": "2e0e98b4-5377-49ab-9e39-608c253d8891"
},
"source": [
"## 3. Pair Plot\n",
"\n",
"sns.pairplot(dataset, hue=\"Species\", palette=\"cubehelix\", height=3)"
],
"execution_count": 24,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
""
]
},
"metadata": {},
"execution_count": 24
},
{
"output_type": "display_data",
"data": {
"image/png":...