Supervised learning - Instructions have to be followed, and all work has to be shown see below for a full explanation as well I will send the data set once I pay - There are two parts to the...

Supervised learning: the instructions have to be followed and all work has to be shown. See below for a full explanation. I will send the data set once I pay.

  1. There are two parts to the submission:


    1. A well commented Jupyter notebook [format - .html]

    2. A presentation as you would present to the top management/business leaders [format - .pdf ](you have to export/save the .pptx file as .pdf)





Description

Background and Context


Best insurance company and My Bank have set up a bancassurance arrangement (bancassurance is a relationship between a bank and an insurance company). Using the data of liability customers of My Bank, the Best insurance company now wants to convert customers who hold both a life insurance policy and an account in My Bank into loan customers (taking a loan against a life insurance policy).


A campaign that the company ran last year for liability customers showed a healthy conversion rate of over 12.56%. You are provided with data on customers who have an account in My Bank and a life insurance policy with the Best insurance company.


You, as a data scientist at the Best insurance company, have to build a model to identify the customers who are most likely to respond positively, that is, those with a higher probability of taking the loan. This will increase the success ratio and reduce the cost of the campaign.
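
Before any model is built, the quoted conversion rate already fixes a reference point. The sketch below takes the 12.56% figure at face value and assumes the modelling data has a similar share of responders (the brief does not confirm this); the variable names are illustrative only.

```python
# Campaign conversion rate quoted in the brief (assumed to carry over to the data).
conversion_rate = 0.1256

# A "model" that labels every customer a non-responder is correct
# whenever the customer really is a non-responder.
baseline_accuracy = 1 - conversion_rate

# The same do-nothing model never identifies a single responder.
baseline_recall = 0.0

print(f"Majority-class accuracy: {baseline_accuracy:.4f}")
print(f"Majority-class recall:   {baseline_recall:.4f}")
```

Under that assumption, an accuracy near 0.87 is what a model gets for free, so accuracy alone says little about how well responders are being found.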


Objective



  • To predict whether a liability customer will buy a loan or not.

  • Which variables are most significant for making predictions.

  • Which segment of customers should be targeted more.


Data Dictionary
* CUST_ID: Unique Customer ID
* Target: Field - 1: Responder, 0: Non-Responder
* Age: Customer Age in years
* Gender: Male / Female / Other
* Balance: Monthly Average Balance
* Occupation: Professional / Salaried / Self Employed / SelfEmployed Non-Professional.
* SCR: Marketing Score
* HOLDING_PERIOD: Duration in days to hold the money
* ACC_TYPE: Account Type: Current Account / Saving Account
* ACC_OP_DATE: Account Open Date
* LEN_OF_RLTN_IN_MNTH: Length of Relationship in Months
* NO_OF_L_CR_TXNS: Number of Credit Transactions
* NO_OF_BR_CSH_WDL_DR_TXNS: Branch Cash Withdrawal Debit Transactions
* NO_OF_ATM_DR_TXNS: Number of ATM Debit Transactions
* NO_OF_NET_DR_TXNS: Number of Net Banking Debit Transactions
* NO_OF_MOB_DR_TXNS: Number of Mobile Banking Debit Transactions
* NO_OF_CHQ_DR_TXNS: Number of Cheque Debit Transactions
* FLG_HAS_CC: Has Credit Card - 1: Yes, 0: No
* AMT_ATM_DR: Amount Withdrawn from ATM
* AMT_BR_CSH_WDL_DR: Amount cash withdrawn from Branch
* AMT_CHQ_DR: Amount debited by Cheque Transactions
* AMT_NET_DR: Amount debited by Net Transactions
* AMT_MOB_DR: Amount debited by Mobile Transactions
* FLG_HAS_ANY_CHGS: Flag: Has any banking charges
* FLG_HAS_NOMINEE: Flag: Has Nominee - 1: Yes, 0: No
* FLG_HAS_OLD_LOAN: Flag: Has any earlier loan - 1: Yes, 0: No




Best Practices for Notebook :



  • The notebook should be well-documented, with inline comments explaining the functionality of code and markdown cells containing comments on the observations and insights.

  • The notebook should be run from start to finish in a sequential manner before submission.

  • It is preferable to remove all warnings and errors before submission.

  • The notebook should be submitted as an HTML file (.html) and NOT as a notebook file (.ipynb)



Best Practices for Presentation :


Like in real-world projects, the ultimate destination of any project or work is generally an executive or decision-making meeting, where you are supposed to present your solution to the business problem, based on the project/work you have done. The purpose of this presentation is to simulate that kind of experience and to draw the attention of your audience (a business leader like CMO, COO, CFO, or CEO) to the key points of your project, which are



  • Business Overview of the problem and solution approach

  • Key findings and insights which can drive business decisions

  • Model overview and performance summary

  • Business recommendations


Please keep the following points in mind while making the presentation:



  • Focus on explaining the takeaways in an easy-to-understand manner.

  • The inclusion of the potential benefits of implementing the solution will give you the edge.

  • Copying and pasting from the notebook is not a good idea, and it is better to avoid showing codes unless they are the focal point of your presentation.

  • Please submit the presentation in PDF format only.




Submission Guidelines :



  1. There are two parts to the submission:

    1. A well commented Jupyter notebook [format - .html]

    2. A presentation as you would present to the top management/business leaders [format - .pdf ](you have to export/save the .pptx file as .pdf)



  2. Any assignment found copied/ plagiarized with other groups will not be graded and awarded zero marks

  3. Please ensure timely submission, as any submission made after the deadline will not be accepted for evaluation

  4. Submission will not be evaluated if,

    1. it is submitted post-deadline, or,

    2. more than 2 files are submitted




Happy Learning!!


Scoring guide (Rubric) - My Bank

Criteria and Points

Perform an Exploratory Data Analysis on the data
- Univariate analysis - Bivariate analysis - Use appropriate visualizations to identify the patterns and insights - Any other exploratory deep dive
5

Illustrate the insights based on EDA
Key meaningful observations on the relationship between variables
5

Data Pre-processing
Prepare the data for analysis: - Missing value Treatment (if needed) - Outlier Detection(treat, if needed) - Feature Engineering - Data split
4

Model building - Logistic Regression
- Build the model and comment on the model statistics - Test assumptions - Filter out key variables that have a strong relationship with the dependent variable
12

Model performance evaluation and improvement
- Comment on which metric is right for model performance evaluation and why? - Comment on model performance - Can model performance be improved? if yes then do it
5

Model building - Decision Tree
- Build the model and comment on the model statistics - Identify the key variables that have a strong relationship with the dependent variable
5

Model performance evaluation and improvement
- Evaluate the model on appropriate metric - Comment on model performance - Can model performance be improved? if yes then do it
8

Actionable Insights & Recommendations
- Compare decision tree and Logistic regression - Conclude with the key takeaways for the marketing team - What would your advice be on how to do this campaign?
4

Presentation - Overall quality
- Structure and flow - Crispness - Visual appeal - Key insights and recommendations
8

Notebook - Overall
- Conclude with the key takeaways for the business - What would your advice be to grow the business?
4


Answered 1 day after Nov 05, 2021


Neha answered on Nov 07 2021
95554 - my bank data analysis/bank data analysis.html
In [1]:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

In [3]:

data=pd.read_csv("my-bank.csv") # importing dataset
data.head()


Out[3]:
                CUST_ID        TARGET        AGE        GENDER        BALANCE        OCCUPATION        SCR        HOLDING_PERIOD        ACC_TYPE        ACC_OP_DATE        ...        NO_OF_CHQ_DR_TXNS        FLG_HAS_CC        AMT_ATM_DR        AMT_BR_CSH_WDL_DR        AMT_CHQ_DR        AMT_NET_DR        AMT_MOB_DR        FLG_HAS_ANY_CHGS        FLG_HAS_NOMINEE        FLG_HAS_OLD_LOAN
        0        C7927        0        27        M        3383.75        SELF-EMP        776        30        SA        3/23/2005        ...        0        0        13100        0        0        973557.0        0        0        1        1
        1        C6877        0        47        M        287489.04        SAL        324        28        SA        10-11-2008        ...        0        0        6600        0        0        799813.0        0        1        1        0
        2        C19922        0        40        M        18216.88        SELF-EMP        603        2        SA        4/26/2012        ...        2        0        11200        561120        49320        997570.0        0        1        1        1
        3        C8183        0        53        M        71720.48        SAL        196        13        CA        07-04-2008        ...        4        0        26100        673590        60780        741506.0        71388        0        1        0
        4        C12123        0        36        M        1671622.89        PROF        167        24        SA        12/29/2001        ...        0        0        0        808480        0        0.0        0        0        1        0
5 rows × 26 columns
EDA¶
In [4]:

# Removing unwanted columns
data.drop(["CUST_ID","ACC_OP_DATE"],axis=1,inplace=True)

In [5]:

# Checking the missing values
data.isnull().sum()


Out[5]:
TARGET 0
AGE 0
GENDER 0
BALANCE 0
OCCUPATION 0
SCR 0
HOLDING_PERIOD 0
ACC_TYPE 0
LEN_OF_RLTN_IN_MNTH 0
NO_OF_L_CR_TXNS 0
NO_OF_BR_CSH_WDL_DR_TXNS 0
NO_OF_ATM_DR_TXNS 0
NO_OF_NET_DR_TXNS 0
NO_OF_MOB_DR_TXNS 0
NO_OF_CHQ_DR_TXNS 0
FLG_HAS_CC 0
AMT_ATM_DR 0
AMT_BR_CSH_WDL_DR 0
AMT_CHQ_DR 0
AMT_NET_DR 0
AMT_MOB_DR 0
FLG_HAS_ANY_CHGS 0
FLG_HAS_NOMINEE 0
FLG_HAS_OLD_LOAN 0
dtype: int64
In [6]:

#Categorical Unordered Univariate Analysis
data.OCCUPATION.value_counts(normalize=True)
#plot a horizontal bar graph of the share of each occupation category
data.OCCUPATION.value_counts(normalize=True).plot.barh()
plt.show()



In [7]:

#Categorical Univariate Analysis (account type)
data.ACC_TYPE.value_counts(normalize=True)
#plot the pie chart of account type shares
data.ACC_TYPE.value_counts(normalize=True).plot.pie()
plt.show()



In [8]:

#Bivariate Analysis
data.plot.scatter(x="AGE",y="BALANCE")
plt.show()
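
The scatter plot above relates two predictors to each other. To address which customer segment should be targeted, it can also help to relate a predictor to TARGET directly. This is a minimal sketch on a made-up six-row frame (the `toy` data and the `by_occupation` name are assumptions, not part of the notebook); in the notebook the same call would run on `data`.

```python
import pandas as pd

# Tiny made-up sample standing in for the bank data.
toy = pd.DataFrame({
    "OCCUPATION": ["SAL", "SAL", "PROF", "PROF", "SELF-EMP", "SELF-EMP"],
    "TARGET":     [0,     1,     0,      0,      1,          1],
})

# The mean of a 0/1 target within a group is that group's response rate.
by_occupation = (
    toy.groupby("OCCUPATION")["TARGET"]
       .agg(customers="size", response_rate="mean")
       .sort_values("response_rate", ascending=False)
)
print(by_occupation)
```

Keeping the group size next to the rate guards against reading too much into a high rate that rests on only a few customers.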



Graphs for EDA¶
In [9]:

# Age Analysis
plt.figure(figsize=(15,10))
age_counts=data.AGE.value_counts() # number of customers at each age
sns.barplot(x=age_counts.index,y=age_counts.values) # keyword args, as newer seaborn versions require
plt.xlabel("age")
plt.ylabel("Frequency")
plt.title("Age Analysis")
plt.show()


In [10]:

# Sex Analysis
plt.figure(figsize=(10,7))
sex=data.GENDER.value_counts()
gender_names={"M":"Male","F":"Female","O":"Other"}
labels=[gender_names[g] for g in sex.index] # legend follows the order returned by value_counts
plt.pie(sex,colors=['red','green',"yellow"],labels=sex.values) # pie chart, slices labelled with counts
plt.legend(labels)
plt.title("Sex Analysis")
plt.show()



In [11]:

plt.figure(figsize=(7,7))
young=len(data[data.AGE<=45]) # up to 45 years
middle=len(data[(data.AGE>45)&(data.AGE<=60)]) # 46 to 60 years
elder=len(data[(data.AGE>60)&(data.AGE<=100)]) # 61 to 100 years
value=(young,middle,elder)
labels=["young","middle","elder"]
plt.pie(value,labels=value) # slices labelled with counts
plt.legend(labels)
plt.title("Dividing Age into 3 categories")
plt.show()
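
The three boolean filters above can also be expressed as a single binning step. A minimal sketch with `pandas.cut`, using a handful of made-up ages (the `ages` series is an assumption for illustration) and the same cut-offs of 45, 60 and 100:

```python
import pandas as pd

# Made-up ages chosen to sit on and around the bin edges.
ages = pd.Series([27, 45, 46, 60, 61, 100], name="AGE")

# Bins are closed on the right: (0, 45], (45, 60], (60, 100].
age_group = pd.cut(ages, bins=[0, 45, 60, 100], labels=["young", "middle", "elder"])

# sort=False keeps the counts in bin order rather than by frequency.
counts = age_group.value_counts(sort=False)
print(counts)
```

The resulting column can be stored on the frame and reused, for example to compare response rates across the three groups.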



In [12]:

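plt.figure(figsize=(10,10))
s=data.OCCUPATION.value_counts()
occupation_names={"PROF":"Professional","SAL":"Salaried","SELF-EMP":"Self Employed","SENP":"SelfEmployed Non-Professional"}
labels=[occupation_names[o] for o in s.index] # name each slice from its own code, not from an assumed order
plt.pie(s,colors=['red','green',"yellow","blue"],labels=labels) #pie chart
plt.legend(labels,loc="best")
plt.title("Different types of Occupations")
plt.show()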



In [13]:

plt.figure(figsize=[8,8])
ac_type=data.ACC_TYPE.value_counts()
# look the counts up by account code rather than by position
plt.bar("SA",ac_type["SA"],label="Saving account",color="green")
plt.bar("CA",ac_type["CA"],label="Current account",color="blue")
plt.legend()
plt.show()



Data preprocessing¶
In [14]:

# Converting categorical variables to numerical variables
# (assigning the result back avoids relying on in-place edits of a column selection)
data["OCCUPATION"]=data["OCCUPATION"].map({"PROF":0,"SAL":1,"SELF-EMP":2,"SENP":3})
data["GENDER"]=data["GENDER"].map({"M":0,"F":1,"O":2})
data["ACC_TYPE"]=data["ACC_TYPE"].map({"SA":0,"CA":1})
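
Integer codes such as 0 to 3 imply an order and equal spacing between occupations, which a logistic regression will treat as real. One common alternative, shown here only as a sketch on a made-up frame (`toy` and `encoded` are illustrative names), is one-hot encoding:

```python
import pandas as pd

# Made-up sample with the four occupation codes used in the notebook.
toy = pd.DataFrame({"OCCUPATION": ["PROF", "SAL", "SELF-EMP", "SENP", "SAL"]})

# One indicator column per category; drop_first removes one level so the
# remaining columns are not perfectly collinear with the intercept.
encoded = pd.get_dummies(toy, columns=["OCCUPATION"], drop_first=True, dtype=int)
print(encoded)
```

The dropped level (PROF here, the first in sorted order) becomes the reference category, and each remaining coefficient is read relative to it.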

In [15]:

data.info() # to check about any NA value in the variables



RangeIndex: 20000 entries, 0 to 19999
Data columns (total 24 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 TARGET 20000 non-null int64
1 AGE 20000 non-null int64
2 GENDER 20000 non-null int64
3 BALANCE 20000 non-null float64
4 OCCUPATION 20000 non-null int64
5 SCR 20000 non-null int64
6 HOLDING_PERIOD 20000 non-null int64
7 ACC_TYPE 20000 non-null int64
8 LEN_OF_RLTN_IN_MNTH 20000 non-null int64
9 NO_OF_L_CR_TXNS 20000 non-null int64
10 NO_OF_BR_CSH_WDL_DR_TXNS 20000 non-null int64
11 NO_OF_ATM_DR_TXNS 20000 non-null int64
12 NO_OF_NET_DR_TXNS 20000 non-null int64
13 NO_OF_MOB_DR_TXNS 20000 non-null int64
14 NO_OF_CHQ_DR_TXNS 20000 non-null int64
15 FLG_HAS_CC 20000 non-null int64
16 AMT_ATM_DR 20000 non-null int64
17 AMT_BR_CSH_WDL_DR 20000 non-null int64
18 AMT_CHQ_DR 20000 non-null int64
19 AMT_NET_DR 20000 non-null float64
20 AMT_MOB_DR 20000 non-null int64
21 FLG_HAS_ANY_CHGS 20000 non-null int64
22 FLG_HAS_NOMINEE 20000 non-null int64
23 FLG_HAS_OLD_LOAN 20000 non-null int64
dtypes: float64(2), int64(22)
memory usage: 3.7 MB
In [16]:

# Dividing data set into dependent and independent variable
ind_var=data.iloc[:,1:]
dep_var=data.iloc[:,0:1]

In [ ]:

# feature scaling
from sklearn.preprocessing import StandardScaler
normalization=StandardScaler()
ind_var=normalization.fit_transform(ind_var) # standardize each feature to zero mean and unit variance
print(ind_var)
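
The cell above fits the scaler on all rows before the train/test split, so the test rows influence the means and standard deviations used for training. A possible alternative, sketched here on synthetic data from `make_classification` (a stand-in, not the bank file), is to split first and let a pipeline fit the scaler on the training rows only:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced stand-in for ind_var / dep_var (hypothetical data).
X, y = make_classification(n_samples=500, n_features=8, weights=[0.87], random_state=0)

# Split first; stratify keeps a similar share of responders in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.75, stratify=y, random_state=0
)

# The pipeline fits the scaler on the training rows only, then reuses
# those statistics when it transforms the test rows.
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
pipe.fit(X_train, y_train)
print(f"Test accuracy: {pipe.score(X_test, y_test):.3f}")
```

With 20,000 rows the difference in scores is likely to be small, but the pipeline keeps the evaluation free of information from the test set.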

Logistic Regression¶
In [18]:

# applying classification model
from sklearn.linear_model import LogisticRegression # importing logistic regression model
from sklearn.model_selection import train_test_split # importing the train/test split helper
from sklearn.metrics import confusion_matrix # importing confusion matrix to check our results
model=LogisticRegression()
x_train,x_test,y_train,y_test=train_test_split(ind_var,dep_var,train_size=0.75,random_state=0) # dividing data into training and testing sets
model=model.fit(x_train,y_train.values.ravel()) # fit on a 1d target, which avoids the DataConversionWarning
model.coef_

Out[18]:
array([[ 0.01772332, -0.07457694, -0.17617968, 0.08932235, 0.19437728,
-0.3871862 , -0.0230113 , -0.09010904, 0.30111464, 0.0258769 ,
0.01675395, -0.05292916, -0.06270942, 0.05437429, 0.29230472,
-0.0211706 , 0.05318017, -0.01812939, 0.05852849, 0.09072257,
0.05290384, 0.02534247, -0.06826344]])
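
The coefficient array is hard to read without feature names. The sketch below pairs the 23 values printed in Out[18] with the column order shown by data.info() (TARGET removed), on the assumption that scaling and splitting kept that order. Because the inputs were standardized, magnitudes are roughly comparable, but this is a ranking aid and not a significance test.

```python
# Feature order follows data.info() with TARGET removed (assumed unchanged by scaling).
features = [
    "AGE", "GENDER", "BALANCE", "OCCUPATION", "SCR", "HOLDING_PERIOD",
    "ACC_TYPE", "LEN_OF_RLTN_IN_MNTH", "NO_OF_L_CR_TXNS",
    "NO_OF_BR_CSH_WDL_DR_TXNS", "NO_OF_ATM_DR_TXNS", "NO_OF_NET_DR_TXNS",
    "NO_OF_MOB_DR_TXNS", "NO_OF_CHQ_DR_TXNS", "FLG_HAS_CC", "AMT_ATM_DR",
    "AMT_BR_CSH_WDL_DR", "AMT_CHQ_DR", "AMT_NET_DR", "AMT_MOB_DR",
    "FLG_HAS_ANY_CHGS", "FLG_HAS_NOMINEE", "FLG_HAS_OLD_LOAN",
]

# Coefficients copied from Out[18].
coefs = [
    0.01772332, -0.07457694, -0.17617968, 0.08932235, 0.19437728,
    -0.3871862, -0.0230113, -0.09010904, 0.30111464, 0.0258769,
    0.01675395, -0.05292916, -0.06270942, 0.05437429, 0.29230472,
    -0.0211706, 0.05318017, -0.01812939, 0.05852849, 0.09072257,
    0.05290384, 0.02534247, -0.06826344,
]

# Sort by absolute size so strong negative effects rank alongside positive ones.
ranked = sorted(zip(features, coefs), key=lambda fc: abs(fc[1]), reverse=True)
for name, coef in ranked[:5]:
    print(f"{name:<20} {coef:+.4f}")
```

On these numbers the largest coefficients belong to HOLDING_PERIOD (negative), NO_OF_L_CR_TXNS and FLG_HAS_CC (both positive), followed by SCR and BALANCE.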
Model evaluation¶
In [ ]:

y_pred=model.predict(x_test) # predict the class of each test row
cm=confusion_matrix(y_test,y_pred)
print(cm)

In [20]:

accuracy=model.score(x_test,y_test) # accuracy of the model on the test set
print("Accuracy of this model is %f" %accuracy)



Accuracy of this model is 0.869800
Decision Tree¶
In [ ]:

In [25]:

from sklearn.tree import DecisionTreeClassifier
clf = DecisionTreeClassifier()
# Train Decision Tree Classifier
clf = clf.fit(x_train,y_train)
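
A tree grown with default settings keeps splitting until its leaves are pure, which tends to overfit. One way to look for a better-generalizing tree is a small cross-validated grid search. This is only a sketch: the data is synthetic (`make_classification`), and the grid values and the choice of recall as the scoring metric are assumptions rather than settings taken from the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the bank data (hypothetical).
X, y = make_classification(n_samples=600, n_features=8, weights=[0.87], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.75, stratify=y, random_state=0
)

# Candidate limits on tree size; None means no depth limit.
param_grid = {"max_depth": [3, 5, 8, None], "min_samples_leaf": [1, 10, 25]}

# Five-fold cross-validation on the training rows, scored by recall on class 1.
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid,
    scoring="recall",
    cv=5,
)
search.fit(X_train, y_train)

print(search.best_params_)
print(f"Test recall: {search.score(X_test, y_test):.3f}")
```

Recall is used here because missing a likely responder is the costlier error in a targeting campaign; if contact costs dominate, precision or F1 may be the better choice.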

Evaluation¶
In [27]:

y_pred = clf.predict(x_test)
cm=confusion_matrix(y_test,y_pred)
print(cm)



[[4173 177]
[ 133 517]]
In [28]:

accuracy=clf.score(x_test,y_test) # accuracy of the model on the test set
print("Accuracy of this model is %f" %accuracy)



Accuracy of this model is 0.938000
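
Accuracy hides how the tree does on responders specifically. The sketch below recomputes the usual metrics from the confusion matrix printed above, assuming scikit-learn's layout (rows are actual classes, columns are predicted classes). It also shows that 4350 of the 5000 test rows are non-responders, so always predicting 0 would score 0.87, which is about where the logistic regression's 0.8698 sits.

```python
# Confusion matrix from the decision tree cell: [[4173, 177], [133, 517]].
tn, fp = 4173, 177   # actual non-responders: correctly rejected, falsely flagged
fn, tp = 133, 517    # actual responders: missed, correctly flagged

total = tn + fp + fn + tp

accuracy = (tp + tn) / total
precision = tp / (tp + fp)            # share of flagged customers who respond
recall = tp / (tp + fn)               # share of responders that were flagged
f1 = 2 * precision * recall / (precision + recall)

# Accuracy of always predicting "non-responder" on this test split.
majority_accuracy = (tn + fp) / total

print(f"accuracy  {accuracy:.3f}")
print(f"precision {precision:.3f}")
print(f"recall    {recall:.3f}")
print(f"f1        {f1:.3f}")
print(f"majority-class accuracy {majority_accuracy:.3f}")
```

Reporting recall and precision next to accuracy makes the comparison between the two models meaningful for the campaign, where the positive class is the one that matters.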
In [ ]:

95554 - my bank data analysis/bank data analysis.ipynb