The hsb data in the faraway package for R is a subset of the High School and Beyond study. The variables include socioeconomic status, school type, chosen high school program type, and scores in certain classes.














1) By looking at the variables, present any tables and plots you may find relevant to this analysis.













2) The goal is to fit a model that explains a person's program choice based on the observed variables. Investigate the coefficients in detail and comment on any interesting findings.






















3) This is a multinomial model with program type as a response (3 levels). Build the confusion matrix and explain it. What is the accuracy of your model and how can we improve it?
























Discuss all of the questions above in detail.


Answered 8 days after Feb 04, 2023

Amar Kumar answered on Feb 06 2023
33 Votes
Q1.
The data for this study come from the High School and Beyond research project at the National Center for Education Statistics. The variables considered are the student's gender, race, socioeconomic status, school type, chosen high school programme type, and test scores in reading, writing, math, science, and social studies. The purpose of the analysis is to examine how these variables relate to the student's choice of high school programme (academic, vocational, or general). The response therefore has three categories.
library(faraway)  # provides the hsb data
data(hsb)
head(hsb, n = 20)
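Question 1 asks for tables as well as plots. The following is a minimal sketch of some summary tables that may be relevant, assuming the column names documented for hsb (prog, ses, read, write, math, science, socst). The object names ses_prog and score_means are chosen here for illustration only.

```r
library(faraway)  # provides the hsb data
data(hsb)

# Number of students in each programme type
prog_counts <- table(hsb$prog)
prog_counts

# Programme choice within each socioeconomic group (rows sum to 1)
ses_prog <- prop.table(xtabs(~ ses + prog, data = hsb), margin = 1)
round(ses_prog, 2)

# Mean test scores within each programme type
score_means <- aggregate(cbind(read, write, math, science, socst) ~ prog,
                         data = hsb, FUN = mean)
score_means
```

Row proportions are used so that groups of different sizes can be compared directly.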
a. Fit a multinomial (three-category) model with programme type as the response and the other relevant variables as untransformed predictors.
library(nnet)  # provides multinom()
m1 <- multinom(prog ~ gender + race + ses + schtyp + read + write + math + science + socst, data = hsb, trace = FALSE)
summary(m1)
b. Simplify the model by backward elimination so that it keeps only the predictors that improve the fit. Note that step() selects on AIC rather than on p-values. The reduced model can then be compared with the full model using a likelihood-ratio test on the difference in deviance.
m2 <- step(m1, scope = ~., direction = "backward", trace = FALSE)
summary(m2)
(dev <- deviance(m2) - deviance(m1))
## [1] 9.680566
# degrees of freedom = difference in the number of estimated parameters
pchisq(dev, m1$edf - m2$edf, lower.tail = FALSE)
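c. Compute the predicted probabilities of the three programme types for the student with ID number 99. The call below takes row 99 of the fitted probabilities, which is the student with ID 99 only if the rows of hsb are ordered by id.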
predict(m2,type="probs")[99,]
## academic general vocation
## 0.1877186 0.3566929 0.4555884
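If the rows of hsb are not sorted by id, row 99 and ID 99 refer to different students. The sketch below selects the student by the id column instead. It refits the models so that it runs on its own, and it assumes that hsb contains an id column with one row for ID 99.

```r
library(faraway)  # hsb data
library(nnet)     # multinom()
data(hsb)

# Refit the full model and reduce it by backward elimination
m1 <- multinom(prog ~ gender + race + ses + schtyp + read + write +
                 math + science + socst, data = hsb, trace = FALSE)
m2 <- step(m1, direction = "backward", trace = FALSE)

# One row of predicted probabilities per student, in the row order of hsb
probs <- predict(m2, type = "probs")

# Row position of the student whose id is 99, and that student's probabilities
row_99 <- which(hsb$id == 99)
row_99
probs[row_99, ]
```

If row_99 equals 99, the two approaches give the same student.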
sprog <- hsb$prog
matplot(prop.table(table(hsb$math, sprog), 1), type = "l", xlab = "Math", ylab = "Proportion", lty = c(1, 2, 5))
matplot(prop.table(table(hsb$science, sprog), 1), type = "l", xlab = "Science", ylab = "Proportion", lty = c(1, 2, 5))
matplot(prop.table(table(hsb$socst, sprog), 1), type = "l", xlab = "Social", ylab = "Proportion", lty = c(1, 2, 5))
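For question 3, the fitted model can also return a predicted class for each student, and cross-tabulating these against the observed programme gives a confusion matrix. The following is a minimal sketch using the full model; the names pred_class, conf_mat and accuracy are illustrative. Because it is computed on the same data used for fitting, the accuracy is likely to be optimistic.

```r
library(faraway)  # hsb data
library(nnet)     # multinom()
data(hsb)

m1 <- multinom(prog ~ gender + race + ses + schtyp + read + write +
                 math + science + socst, data = hsb, trace = FALSE)

# Most probable programme for each student under the fitted model
pred_class <- predict(m1, type = "class")

# Rows: observed programme; columns: predicted programme
conf_mat <- table(observed = hsb$prog, predicted = pred_class)
conf_mat

# Share of students on the diagonal, i.e. classified correctly
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
accuracy
```

Diagonal cells count correct classifications and off-diagonal cells show which programmes are confused with one another. A held-out test set or cross-validation would give a fairer estimate of accuracy.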
Q2:
A regression model describes the relationship between the response and several independent variables at once through coefficients, each of which gives the effect of a one-unit change in one predictor with the other predictors held fixed. Because the response here is categorical, the model is a multinomial logit rather than a straight-line linear regression: each coefficient is the change in the log-odds of choosing a given programme rather than the baseline programme. The coefficients are estimated by maximum likelihood, that is, by choosing the values that make the observed programme choices most probable.
The size and significance of the model's coefficients should be examined carefully, since they shed light on the connections between the response and the predictors. The magnitude of a coefficient indicates the strength of the association, and a statistically significant coefficient suggests that the association is unlikely to be the result of chance.
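One way to read the size of the coefficients is to exponentiate them. In a multinomial logit model each coefficient is a change in log-odds relative to the baseline response level, so exp() turns it into an odds ratio. The sketch below assumes the full model from Q1; the name odds_ratios is illustrative.

```r
library(faraway)  # hsb data
library(nnet)     # multinom()
data(hsb)

m1 <- multinom(prog ~ gender + race + ses + schtyp + read + write +
                 math + science + socst, data = hsb, trace = FALSE)

# The baseline programme is the first level of the response factor
levels(hsb$prog)[1]

# One row of coefficients per non-baseline programme, one column per term
coefs <- coef(m1)

# Odds ratios: multiplicative change in the odds of that programme
# versus the baseline for a one-unit increase in the predictor
odds_ratios <- exp(coefs)
round(odds_ratios, 3)
```

An odds ratio below 1 means that higher values of the predictor shift the odds toward the baseline programme, and a value above 1 means the opposite.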
Hypothesis testing, which compares a null hypothesis with an alternative hypothesis, is a common way to assess the significance of the coefficients. Under the null hypothesis, the coefficient equals zero, meaning the predictor has no association with the response. Under the alternative hypothesis, the coefficient is not zero and the predictor is associated with the response. The p-value is the probability of observing an estimate at least as extreme as the one obtained if the null hypothesis were true. If the p-value is below the chosen significance level, typically 0.05, the null hypothesis is rejected: the coefficient is significantly different from zero and the corresponding predictor helps explain the variation in the response.
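summary() on a multinom fit reports coefficients and standard errors but not p-values. A rough sketch of approximate Wald tests, assuming the estimates are close to normally distributed, divides each coefficient by its standard error and refers the result to a standard normal distribution. The names z_values and p_values are illustrative.

```r
library(faraway)  # hsb data
library(nnet)     # multinom()
data(hsb)

m1 <- multinom(prog ~ gender + race + ses + schtyp + read + write +
                 math + science + socst, data = hsb, trace = FALSE)

s <- summary(m1)

# Wald z statistic: estimate divided by its standard error
z_values <- s$coefficients / s$standard.errors

# Two-sided p-value under the null hypothesis that the coefficient is zero
p_values <- 2 * pnorm(abs(z_values), lower.tail = FALSE)
round(p_values, 3)
```

Each term has one p-value per non-baseline programme, so a predictor may matter for one contrast and not for the other. A likelihood-ratio test that drops the whole term is often preferred for an overall judgement.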
Here are some possible interesting findings:
School type: The coefficient for school type describes the link between the type of school a student attends and the programme the student chooses. Because school type is a two-level factor, the model reports a coefficient for one level relative to the other (the reference level), separately for each non-baseline programme. If that coefficient is significant and positive, students from that school type are more likely than students from the reference school type to choose the programme in question rather than the baseline programme; if it is negative, they are less likely to. For instance, with private schools as the reference level, a negative public-school coefficient for a programme would imply that public-school students are less likely than private-school students to choose it.
This association could reflect the nature of the education provided at the school, the resources available to students, or the kinds of programmes it offers. Private schools, for instance, may offer more specialised programmes and have more resources, making them more attractive to students with a particular interest. Public schools, on the other hand, may not have the same resources and may offer fewer specialised programmes, making them less attractive to students interested in certain fields.
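Before interpreting the school type coefficient, it can help to look at the raw proportions it summarises. The sketch below tabulates programme choice within each school type, assuming the schtyp and prog columns of hsb; the name schtyp_prog is illustrative. Unlike the model coefficient, these proportions are not adjusted for the other predictors.

```r
library(faraway)  # hsb data
data(hsb)

# Reference level of school type (the first factor level)
levels(hsb$schtyp)[1]

# Counts of programme choice by school type
schtyp_counts <- xtabs(~ schtyp + prog, data = hsb)
schtyp_counts

# Proportion choosing each programme within each school type (rows sum to 1)
schtyp_prog <- prop.table(schtyp_counts, margin = 1)
round(schtyp_prog, 2)
```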
Socioeconomic status: The association between a student's background and programme preference can be revealed by the socioeconomic status coefficient. Students from richer socioeconomic backgrounds may be more inclined to select particular high school programmes if the socioeconomic status coefficient is significant and positive. In contrast, if the coefficient is negative and significant, it may indicate that kids from poorer socioeconomic backgrounds are less likely to enrol in particular high school...