Please read instructions document. This is a continuation of Order #104271, preferably hire same expert or someone that is adept in Data Analytics.



BTA 350 Analytics Project, Name

INSTRUCTIONS

NOTES
· This is a continuation from order no.: 104271 (Preferably hire same expert or someone that is an expert in Data Analytics).
· Please read the previous sections for context in order to complete remaining sections. (Link to the original dataset is in Section 1).
· No specific writing format required, however please use American English.
· Section 10 consists of making 3-presentation slides (with a voice over). However, if you cannot do a voice over, please provide a script. I need to be able to articulate the slide-presentation.
· This is due until June 1st (almost an entire month, so please take your time and do not rush. This is worth a lot of points).

TO DO
· Please complete the remaining sections: Sections 7-10, starts on page 13. (Each section has a word count minimum, which is approx. 1600 words total).
· Additionally, revise the previous sections (Skip Section 1) by implementing Marta's notes, the course professor, that are on the right-side of the document. (Approx. 400 words for revision).

BTA 350 Analytics Project
Bank Marketing Data Set
[Comment by Marta Stelmaszak Rosa: You may want to give it a more descriptive title once you're done with the analysis and recommendations.]
Steve Gallegos
03/28/2022

Project Section 1
Bank Marketing Data Set:
· Bank-additional-full.csv with all examples (41188) and 20 inputs, ordered by date (from May 2008 to November 2010), very close to the data analysed in [Moro et al., 2014]
· Link to data set: UCI Machine Learning Repository: Bank Marketing Data Set

WEEK      PROJECT SECTION       DUE BY
Week 1    Project Section 1     04/03
Week 2    Project Section 2     04/10
Week 3    Project Section 3     04/17
Week 4    Project Section 4     04/24
Week 5    Project Section 5     05/01
Week 6    Project Section 6     05/08
Week 7    Project Section 7     05/15
Week 8    Project Section 8     05/22
Week 9    Project Section 9     05/29
Week 10   Project Section 10    06/05

Project Section 2: CONTEXT
Structured approach of the Bank marketing data set:
The data set pertains to a banking institution and describes the direct phone marketing campaigns it ran so as to attract clients to subscribe to term deposits in the bank. This is done so as to increase deposits and also to understand the behavior of customers in approving term deposits in the bank. The data set is originally collected from the UCI Machine Learning Repository. The data set referred to in this abstract gives us insights for developing new marketing strategies for the bank in the years to come. This helps improve customer relationship management as a whole.
[Comment by Marta Stelmaszak Rosa: You got it from there, but it was collected from the Portuguese bank.]
The bank client data in the data set has inputs that give personal, educational, demographic and income level information about the clients. These help in analyzing the output of the workflow, which is simply the likely "yes" for a term deposit subscription. The approach for such a study would involve detailed analysis of the existing customer database.
This would also help in filtering the potential clients who are more likely to subscribe to a term deposit plan in the bank. Developing better strategies for future marketing campaigns is the key aspect of analyzing the data set. This would also ensure better sales and revenue for the institution.
[Comment by Marta Stelmaszak Rosa: This doesn't sound specific or robust enough. You want to refer to a methodology for conducting such a project, for example DATA from the textbook, or the structured approach from the lectures. Present the approach here and relate it to this project.]

Project Section 3: Defining the problem
Problem definition:
In this section we analyze the data set in three aspects. In the first problem we test the marital status of the customer against the outcome variable. In the second, major problem we predict whether or not the client will subscribe to a term deposit. In the third and last problem we test whether or not the client's education has any effect on the term deposit subscription.
[Comment by Marta Stelmaszak Rosa: This is the specific analytics question you're asking. But what is the problem that the business is dealing with here that your analysis would help with? And why is this relevant for the bank? You need to define the overall problem first before diving deeper into analytics questions.]

Questions:
The questions for the problem defined earlier are as follows:
Question 1: Whether the marital status of the client has any impact on the outcome variable, that is, whether marital status and the outcome variable are independent or dependent.
Question 2: This problem relates to the prediction of the outcome variable. We are interested in building a model that will predict whether a client with certain variables is going to subscribe to a term deposit or not.
The link between this question and the problem defined is that we are able to classify a client or customer as a subscribed term deposit client or an unsubscribed term deposit client.
Question 3: Whether the education level has any impact on the subscription to the term deposit, that is, whether the two variables are dependent or not.

Business relevance of the problem:
The problems defined in the previous sections have a minor or major impact on the business. That is, if we are able to classify a customer as subscribed or not, then we can give customers offers that will attract them to subscribe to the bank's term deposit service, because we know which things will attract them towards the services of the bank.

Project Section 4: Data
In this section we explain the data set used in the analysis and generate some summary statistics and exploratory data analysis. The description of the data set is given as follows:
Age: age of the client (numeric)
Job: type of job
Marital: marital status
Education: education type
Default: has credit in default?
Housing: has housing loan?
Loan: has personal loan?
Contact: contact communication type
Month: last contact month of year
Day_of_week: last contact day of the week
Duration: last contact duration, in seconds (numeric)
Campaign: number of contacts performed during this campaign and for this client
Pdays: number of days that passed by after the client was last contacted from a previous
Previous: number of contacts performed before this campaign and for this client (numeric)
Poutcome: outcome of the previous marketing campaign
Emp.var.rate: employment variation rate - quarterly indicator (numeric)
Cons.price.idx: consumer price index - monthly indicator (numeric)
Cons.conf.idx: consumer confidence index - monthly indicator (numeric)
Euribor3m: Euribor 3-month rate - daily indicator (numeric)
Nr.employed: number of employees - quarterly indicator (numeric)
Output variable (desired target): Y - has the client subscribed a term deposit? (Binary: "yes", "no")
[Comment by Marta Stelmaszak Rosa: This is great but can be presented better. How about a table?]

The summary statistics table for the data is given as follows:
We will explain 2-3 variables here. In the job variable, the largest number of people work in the admin department and the smallest number are retired. In education, most of the people are university students and few are in another category. There are also some missing values in the variables, labelled as unknown.
[Comment by Marta Stelmaszak Rosa: Why did you select these specific variables? You need to justify your choice.]

The following are some graphs produced from the data set:
This is the bar plot describing the marital status of the clients. From this visual we can say that most of the clients are married, followed by single and then divorced people. There are a few missing values present in the variable.
[Comment by Marta Stelmaszak Rosa: This plot by itself is not as informative as it could be. Try plotting marital status against the Y variable to see if they're in any way related.]

Consider the next boxplot, created with the age variable with respect to the outcome variable:
The boxplot tells two things about the variable: the first is the distribution and the second is the outliers present in the variable. Here, the distribution of the age variable is approximately skewed, and outliers are present for both the categories "yes" and "no". For a good analysis, one needs to handle the missing values and the outliers present.
[Comment by Marta Stelmaszak Rosa: These plots and your analysis are good to support why these variables, but they don't tell us much about the relationship between the variable and the Y.]

Project Section 5: Analysis
In the first model we are going to do the independence test between the two variables.
The two variables to be tested are as follows:
Marital: marital status
Output variable (desired target): Y - has the client subscribed a term deposit? (Binary: "yes", "no")
For a test of independence, the chi-square test is the most appropriate test. The level of significance for the test is 0.05. The two-way contingency table is given as follows:

table(df$marital, df$y)
              no    yes
divorced    4136    476
married    22396   2532
single      9948   1620
unknown       68     12

The chi-square test output is given as follows:

chisq.test(table(df$marital, df$y))
Pearson's Chi-squared test
data: table(df$marital, df$y)
X-squared = 122.66, df = 3, p-value < 2.2e-16

The test-statistic value is 122.66 with 3 degrees of freedom, with p-value approximately 0.0.
[Comment by Marta Stelmaszak Rosa: Great analyses, but the people reading your report won't understand what this means. Follow this sentence with an explanation in words what this output means for the variable you studied.]

The next statistical model is the same independence test between two variables. The two variables are education and the outcome variable Y. The variables are explained as follows:
Education: education type
Output variable (desired target): Y - has the client subscribed a term deposit? (Binary: "yes", "no")
The two-way contingency table is given as follows:

table(df$education, df$y)
               no    yes
basic.4y     3748    428
basic.6y     2104    188
basic.9y     5572    473
high.school ...
Answered 15 days after May 05, 2022


Sathishkumar answered on May 21 2022
BTA 350 Analytics Project, Name
Project Section 7: Interpretation
In this work I posed three questions about the term deposit conversion rate. Question 1: does the marital status of the clients affect the conversion rate? Question 2: how many clients have been converted to a term deposit? Question 3: does the education of the clients affect or improve the conversion rate for term deposits?
For this work, I downloaded the data set from the internet and performed an Extract, Transform and Load (ETL) process: the data is collected as a CSV file and then loaded into a data frame using the Pandas library. After that, I used exploratory data analysis (EDA) to answer the three questions above, and the chi-squared test to check whether the variables are related. Married clients make up a larger share of the data set than single clients. In absolute numbers, married clients also account for more term deposit subscriptions than single clients, single clients for more than divorced clients, and clients whose marital status is unknown for very few.
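The comparison between marital groups can be checked directly from the Section 5 contingency table. The following is a minimal sketch in plain Python, not the code used for the analysis; the names `MARITAL_COUNTS` and `conversion_rates` are made up for illustration, and the counts are copied from the table above.

```python
# Counts copied from the marital-status contingency table in Section 5.
# Each tuple is (did not subscribe, subscribed).
MARITAL_COUNTS = {
    "divorced": (4136, 476),
    "married": (22396, 2532),
    "single": (9948, 1620),
    "unknown": (68, 12),
}


def conversion_rates(counts):
    """Return {group: share of contacted clients who subscribed}."""
    return {group: yes / (no + yes) for group, (no, yes) in counts.items()}


rates = conversion_rates(MARITAL_COUNTS)

# Print the groups from highest to lowest conversion rate.
for group, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    no, yes = MARITAL_COUNTS[group]
    print(f"{group:<9} subscribed={yes:>5}  contacted={no + yes:>6}  rate={rate:.1%}")
```

On these counts, ranking the groups by number of subscribers and ranking them by rate give different orders: married clients have the most subscriptions in absolute terms, while the share of contacted clients who subscribed works out to about 14.0% for single clients against about 10.2% for married clients. It is worth stating which of the two measures a conclusion refers to.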
Next, I analyzed whether education affects the conversion rate. Based on the data set, clients with a university degree have the highest number of term deposits, 1670 out of 10498. Clients with a high school education have 1031 out of 8484, whereas illiterate clients have the lowest count, 4 out of 14. In this analysis I used exploratory data analysis to produce a statistical report of the data set. Using EDA, I generated the mean, variance, standard deviation, quartiles, etc., and created graphs such as a bar plot and a candle chart.
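As an illustration of the summary statistics named above, here is a small sketch using Python's standard `statistics` module. The `ages` list is hypothetical, made up for this example, and is not taken from the bank data set; the same calls could be applied to any numeric column.

```python
import statistics

# Hypothetical ages, made up for illustration; not taken from the data set.
ages = [30, 40, 50, 60]

summary = {
    "mean": statistics.mean(ages),
    "variance": statistics.variance(ages),          # sample variance (divides by n - 1)
    "std_dev": statistics.stdev(ages),              # sample standard deviation
    "quartiles": statistics.quantiles(ages, n=4),   # three cut points: Q1, median, Q3
}

for name, value in summary.items():
    print(name, value)
```

Note that `statistics.variance` and `statistics.stdev` are the sample versions; `pvariance` and `pstdev` would be the choice if the data were treated as the whole population.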
Finally, I focused on predicting term deposit subscription. For that I used a machine learning technique, logistic regression, to predict whether a client is converted to a term deposit or not. The input data is split into training and testing sets, which are then applied to the logistic regression model. The predicted results are then used to compute performance metrics. For the entire work I have created two models; they are:
Marital status: The test used is the chi-square test of independence, which is a non-parametric test. The test statistic is 122.66 with 3 degrees of freedom, and the p-value is approximately 0.0. The p-value is therefore less than the level of significance, so the null hypothesis is rejected: there is evidence that the two variables are not independent of each other. A client's marital status is associated with whether they subscribe to a term deposit. A confounding variable can have an indirect effect on the outcome variable. In the above test, for example, the job of a married or single client may influence their decision to subscribe.
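The statistic reported for marital status can be reproduced by hand from the Section 5 contingency table. The sketch below is plain Python rather than the R or Pandas code used in the project; `chi_square_statistic` is a name chosen for illustration, and the constant 7.815 is the standard table value for the upper 5% point of the chi-square distribution with 3 degrees of freedom.

```python
# Observed counts from the Section 5 marital-status table.
# Rows: divorced, married, single, unknown. Columns: no, yes.
OBSERVED = [
    [4136, 476],
    [22396, 2532],
    [9948, 1620],
    [68, 12],
]

# Upper 5% critical value of the chi-square distribution with
# 3 degrees of freedom (standard table value).
CRITICAL_VALUE_DF3_ALPHA_05 = 7.815


def chi_square_statistic(observed):
    """Return (Pearson chi-square statistic, degrees of freedom) for a two-way table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected
    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, dof


statistic, dof = chi_square_statistic(OBSERVED)
reject_null = statistic > CRITICAL_VALUE_DF3_ALPHA_05
print(f"X-squared = {statistic:.2f}, df = {dof}, reject H0 at 0.05: {reject_null}")
```

Comparing the statistic with the critical value leads to the same decision as comparing the p-value with the 0.05 level of significance. The test only indicates that an association exists; it does not say which group drives it or why.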
Education status: The test used is the chi-square test of independence, which is a non-parametric test. The test statistic is 193.11 with 7 degrees of freedom, and the p-value is approximately 0.0. The p-value is therefore less than the level of significance, so the null hypothesis is rejected: there is evidence that the two variables are not independent of each other. The educational status of the customer is associated with whether they subscribe to a term deposit. In this situation, where the independent variable is the type of education and the outcome is the variable y, the client's income level will influence his decision to...