
Please do work in an R Markdown file.

Assignment 2
MA 347, Spring 2024
Due End of Day, Sep 29, 2024

Reading

- Textbook: read selected sections in Chapter 10 (Pg. 237 - 247) and Chapter 5 (skipping all discussions involving costs).
- Read the application paper on logistic regression analysis by Jeffrey S. Morrison.

Problems

The problems should be written up neatly and any output clearly labeled. If your grader cannot read them, your grader cannot grade them. The final write-up should be your own. (Good rule of thumb: if a fellow student asks you to explain a problem to them, say yes; if they ask how you approached solving the problem, say no.) Please show R code and relevant R output. If you are using R Markdown, please submit the knitted file (either an HTML page saved as a PDF file or a PDF file).

1. Import "PredResults.csv" into R, and answer the questions below. This dataset includes a small set of predictive model validation results for a classification model, with both actual values and predicted probabilities.

   (a) Produce a confusion matrix for each cutoff of 0.25, 0.5, and 0.75.
   (b) Calculate the error rate (misclassification rate), sensitivity, and specificity for the three different cutoffs.
   (c) If the goal is to find the strategy, i.e., the cutoff value, that minimizes the error rate, what cutoff value will you choose?
   (d) Create a lift chart. What is the lift of the 2nd 10% of the data according to the gains table? What does it tell you about the classification model compared to a naive model?

2. The file eBayLogistic.csv contains information on 1972 auctions transacted on eBay.com during May-June 2004. The goal is to use these data to build a model that will classify competitive auctions from noncompetitive ones. A competitive auction is defined as an auction with at least one bid placed on the item auctioned. Details of predictors and response are as follows.

   | Variable | Description |
   | --- | --- |
   | SellerRating | A rating by eBay |
   | Duration | Number of days the auction lasted |
   | ClosePrice | Price item sold (in USD) |
   | currency | A categorical variable indicating type of currency used in a transaction |
   | Competitive | Whether or not the auction is competitive: 1 = competitive (yes), 0 = noncompetitive (no) |

   (a) Randomly split the dataset into training and validation sets. Use 60% of the data in your training set.
   (b) Use the first four variables in the table above as predictors to fit a logistic regression model for this classification problem.
   (c) Write down the fitted logistic regression model.
   (d) Describe all of the dummy variables used in the logistic regression model.
   (e) Interpret the estimated logistic regression coefficients associated with the predictors Duration and currency.
   (f) Produce predicted probabilities of competitive auctions for transactions in your validation set.
   (g) Produce ROCs on both the training and validation data.
   (h) Report the AUCs on both the training and validation data.

3. A data mining routine has been applied to a transaction dataset and has classified 88 records as fraudulent (30 correctly so) and 952 as non-fraudulent (920 correctly so). Construct the confusion matrix and calculate the overall error rate (i.e., misclassification rate).

4. Let p denote the probability of event A, with 0 ≤ p ≤ 1. Run the following R code and show the plot you obtain.

       p <- seq(0, 1, by = 0.01)
       y <- p / (1 - p)
       plot(p, y, type = "l", ylab = "odds", lwd = 4)

   Question: what is the relationship between the odds and the probability of event A? Hint: how do the odds of event A change when p increases? If the odds increase, how does p change?
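As a rough illustration of what Problem 1 (a)-(c) asks for, the sketch below builds a confusion matrix and the three summary measures at each cutoff. The vectors `actual` and `prob` are made-up toy data, not the contents of PredResults.csv, and the helper name `classify_at` is hypothetical; the rule "predict 1 when the probability is at least the cutoff" is also an assumption, since the assignment does not say how ties at the cutoff are handled.

```r
# Hypothetical toy data: NOT the contents of PredResults.csv.
actual <- c(1, 1, 1, 0, 1, 0, 0, 1, 0, 0)
prob   <- c(0.95, 0.80, 0.70, 0.60, 0.45, 0.40, 0.30, 0.20, 0.15, 0.05)

# Hypothetical helper: confusion matrix and summary measures at one cutoff.
classify_at <- function(actual, prob, cutoff) {
  # Assumption: a record is classified as 1 when prob >= cutoff.
  predicted <- ifelse(prob >= cutoff, 1, 0)
  # Fixing the factor levels keeps the table 2 x 2 even if a class is never predicted.
  cm <- table(Predicted = factor(predicted, levels = c(0, 1)),
              Actual    = factor(actual,    levels = c(0, 1)))
  tp <- cm["1", "1"]; tn <- cm["0", "0"]
  fp <- cm["1", "0"]; fn <- cm["0", "1"]
  list(confusion   = cm,
       error_rate  = (fp + fn) / sum(cm),   # share of misclassified records
       sensitivity = tp / (tp + fn),        # share of actual 1s caught
       specificity = tn / (tn + fp))        # share of actual 0s caught
}

cutoffs <- c(0.25, 0.5, 0.75)
results <- lapply(cutoffs, function(k) classify_at(actual, prob, k))
names(results) <- cutoffs

# Error rate at each cutoff, side by side.
sapply(results, function(r) r$error_rate)
```

On the real data, `actual` and `prob` would be replaced by the corresponding columns of the imported file, whose names are not given in the assignment text.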
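The counts in Problem 3 determine all four cells of the confusion matrix: 88 - 30 = 58 records were flagged as fraudulent but are not, and 952 - 920 = 32 were flagged as non-fraudulent but are fraudulent. A minimal sketch of that arithmetic follows; the object names `cm` and `error_rate` and the row/column labels are only illustrative.

```r
# Rows are the routine's classification, columns are the true class.
cm <- matrix(c(30,        88 - 30,    # classified fraudulent: 30 correct, 58 wrong
               952 - 920, 920),       # classified non-fraudulent: 32 wrong, 920 correct
             nrow = 2, byrow = TRUE,
             dimnames = list(Predicted = c("fraud", "non-fraud"),
                             Actual    = c("fraud", "non-fraud")))

# Overall error rate = misclassified records / all records.
error_rate <- (cm["fraud", "non-fraud"] + cm["non-fraud", "fraud"]) / sum(cm)
```

Here `sum(cm)` is 1040 and the off-diagonal cells add up to 90, so `error_rate` is 90/1040, roughly 0.0865.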
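The workflow in Problem 2 (split, fit, predict, AUC) can be sketched in base R as below. Everything here runs on simulated stand-in data, not eBayLogistic.csv: the data frame `sim`, its lowercase column names, the currency levels, the seed, and the helper `auc` are all assumptions made for illustration, and only two of the four predictors are simulated. The AUC is computed with the rank (Mann-Whitney) formula rather than a dedicated ROC package, which is one option among several.

```r
set.seed(347)  # arbitrary seed so the sketch is reproducible

# Simulated stand-in data: NOT eBayLogistic.csv.
n <- 500
sim <- data.frame(
  duration = sample(c(1, 3, 5, 7, 10), n, replace = TRUE),
  currency = factor(sample(c("EUR", "GBP", "US"), n, replace = TRUE))
)
sim$competitive <- rbinom(n, 1, plogis(0.8 - 0.15 * sim$duration))

# Random 60% / 40% split into training and validation sets.
train_idx <- sample(n, size = round(0.6 * n))
train <- sim[train_idx, ]
valid <- sim[-train_idx, ]

# Logistic regression; R creates the dummy variables for the factor `currency`
# automatically, with the first level ("EUR" here) as the reference category.
fit <- glm(competitive ~ duration + currency, data = train, family = binomial)

# exp(coefficient) is the multiplicative change in the odds for a one-unit
# increase in the predictor (or versus the reference level for a dummy).
odds_ratios <- exp(coef(fit))

# Predicted probabilities on the validation set.
p_valid <- predict(fit, newdata = valid, type = "response")

# Hypothetical helper: AUC via the rank (Mann-Whitney) formula.
auc <- function(actual, prob) {
  r  <- rank(prob)            # average ranks handle tied probabilities
  n1 <- sum(actual == 1)
  n0 <- sum(actual == 0)
  (sum(r[actual == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

auc_train <- auc(train$competitive, fitted(fit))
auc_valid <- auc(valid$competitive, p_valid)
```

Comparing `auc_train` with `auc_valid` gives a rough check on overfitting: a training AUC far above the validation AUC suggests the model does not generalize as well as the training fit implies.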

Sep 22, 2024
