As usual, you have a homework case. This week, we're back to Holmes University, working on a freshmen retention task force. We will build a model to predict whether students will return for their sophomore year, and consider how to use the model to decide which students will receive a costly intervention.
We're back to Holmes University this week. We're on a freshmen retention task force, trying to identify freshmen who are likely to leave Holmes University, i.e., not return for their sophomore year.

We've collected the following variables:
- GPA: The student's GPA in their freshman year
- Athlete: =1 if the student is an athlete, =0 otherwise
- Miles from home: Distance from campus to the student's home
- College: College in which the student is enrolled: Education, Business, or Arts and Sciences
- Accommodations: Home or Dorm
- Work Hours: The number of hours the student said they worked at a job during the last week. They could answer 0, 0-5, 5-10, 10-15, 15-20, or 20+; this has been coded with the midpoint of that range, or 22.5 for 20+. Not perfect, but it's the best we have.
- Attends office hours: How often the student says they go to office hours: Never, Sometimes, or Regularly
- HS GPA: The student's high school GPA
- Return: Dependent variable; =1 if the student returned, =0 if the student did not return.

Your sample includes 500 students; of those, 395 return, and 105 do not.
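To make the coding of these variables concrete, here is a minimal sketch in Python. The midpoint values follow the Work Hours description above (with an answer of 0 assumed to be coded as 0); the helper `dummy_code`, the choice of baseline level, and the `is_...` indicator names are illustrative assumptions, not part of the case data.

```python
# Midpoint coding for Work Hours, as described in the variable list.
# Assumption: an answer of "0" is coded as 0.0.
WORK_HOURS_MIDPOINT = {
    "0": 0.0,
    "0-5": 2.5,
    "5-10": 7.5,
    "10-15": 12.5,
    "15-20": 17.5,
    "20+": 22.5,  # open-ended range, coded as 22.5 per the case
}

COLLEGES = ["Education", "Business", "Arts and Sciences"]


def dummy_code(value, levels, baseline):
    """Return 0/1 indicators for every level except the baseline.

    Hypothetical helper: the baseline level is the one left out of the
    regression, so each coefficient is read relative to it.
    """
    if value not in levels:
        raise ValueError(f"unknown level: {value!r}")
    return {f"is_{lvl}": int(value == lvl) for lvl in levels if lvl != baseline}


# Example: a Business student who reported working 10-15 hours.
hours = WORK_HOURS_MIDPOINT["10-15"]
college = dummy_code("Business", COLLEGES, baseline="Arts and Sciences")
print(hours, college)
```

The same helper would apply to Accommodations and Attends office hours; which level serves as the baseline is a modeling choice and only changes how the coefficients are read.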
Build a logistic regression model to predict which students will leave/return to Holmes University for their sophomore year.
- In addition to the variables given, consider polynomial and cross-product terms.
- Particularly, it looks like GPA, College, and Miles from home are important variables; a polynomial or cross-product involving those variables is useful.

Interpret the parameter estimates in your model, including numerical effects or graphical display of effects.
- What generally makes students more or less likely to leave Holmes University?

The retention task force plans to use your model to identify students who are likely to leave. It will place them in a program where they get access to additional services and possibly a small financial incentive to return. The cost of this program is $1,000 per student you identify as likely to leave. Every student who you correctly identify as likely to leave will now be more likely to return: correctly identifying a student as likely to leave gains $4,000 per student.
- What cutoff probability should you use to identify likely leavers?
- How much net benefit will this program give the university, based on the 500 students in your sample?

Write findings in a case report as usual, and submit it by 11:59pm on Sunday.
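The cutoff question can be approached by computing net benefit at a range of candidate cutoffs. The following is a rough sketch, not a solution: the function names and the eight-student toy data are made up for illustration, and `p_leave` stands for each student's predicted probability of leaving (1 minus the predicted probability that Return = 1 from your model). It assumes, as the case states, a $1,000 cost for every flagged student and a $4,000 gain for every flagged student who really would have left.

```python
def net_benefit(p_leave, left, cutoff, cost=1000, gain=4000):
    """Net dollars from flagging every student with P(leave) >= cutoff.

    p_leave: predicted probabilities of leaving
    left:    1 if the student actually left, 0 if they returned
    """
    flagged = [p >= cutoff for p in p_leave]
    n_flagged = sum(flagged)
    # Correctly identified leavers: flagged AND actually left.
    true_pos = sum(1 for f, y in zip(flagged, left) if f and y == 1)
    return gain * true_pos - cost * n_flagged


def best_cutoff(p_leave, left, cutoffs):
    """Return the candidate cutoff with the highest net benefit.

    Ties go to the first (lowest) cutoff in the list.
    """
    return max(cutoffs, key=lambda c: net_benefit(p_leave, left, c))


# Hypothetical toy data, NOT the Holmes University sample.
p_leave = [0.05, 0.10, 0.20, 0.30, 0.45, 0.60, 0.80, 0.90]
left = [0, 0, 1, 0, 1, 0, 1, 1]

cutoffs = [i / 20 for i in range(1, 20)]  # 0.05, 0.10, ..., 0.95
c = best_cutoff(p_leave, left, cutoffs)
print(c, net_benefit(p_leave, left, c))
```

As a sanity check on whatever cutoff the sweep suggests: under these cost and gain figures, flagging a student whose probability of leaving is p has an expected value of 4000p - 1000 dollars, which is positive when p is above 0.25. If the model's probabilities are well calibrated, the empirical best cutoff on the 500-student sample should land in that neighborhood; on tiny toy data like the above it need not.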
https://docs.google.com/spreadsheets/d/1A17gTDA_EK3NN11IHc0d2o20EKu3d_-oE1cFtjig0Ro/edit?usp=sharing