Outline for project documents: You need to summarize the research question and analysis of regression/generalization results in section 1 (maximum 3 pages), present the results (cut and paste of tables...

QMB 3301


Outline for project documents: Summarize the research question and your analysis of the regression and generalization results in section 1 (maximum 3 pages), present the results (tables and plots cut and pasted from the Excel results) in section 2, and present the generalization results (predictions, prediction errors, and MSE from all the test sample points) in section 3. The whole report can run longer than three pages, depending on sections 2 and 3. The headings below are the ones discussed in class.

Note: Strictly follow the suggested headings and their order for section 1.

Cover page
Make sure that you have the names of all the group members.

Section 1 headings (analysis of the results reported in sections 2 and 3):
· Research question (RQ): What is the RQ you are trying to answer by running the regression?
· Data source: Refer to the project guideline document.
· Descriptive statistics: Report and analyze the mean, standard deviation, skewness, kurtosis, and outliers. Report whether the data violated the normality assumption.
· Correlation analysis: Analyze and report the two categories of correlations discussed in the class sessions.
· Regression results: Adjusted R2, F-stat, p-value, estimates of the IDV coefficients, regression equation, t-tests for identifying significant IDVs, residual analysis plots, and normality plot.
· Generalization results on test sample: Analyze the results by comparing MSEcalibration and MSEtest.
· Conclusion about implementation: Conclude whether the model is implementable by summarizing the regression and generalization results in a paragraph.

Section 2: Present all relevant results from your regression run on the calibration file that you analyze in section 1.

Section 3: Present all the test samples with the relevant columns for calculating MSEtest.

Note: Present all the sections in one Word file, upload it, and bring one printed copy of the Word document to class.
Project: Regression Project Guidelines

Data source: SPSS Survival Manual, 4e, 2010, by Julie Pallant. Refer to this under the heading labeled “data source” in your report. The author obtained the scaled scores of the following variables from an experiment in psychology, through a set of survey questions answered by the respondents.

Dependent variable (DV):
tlifesat: total life satisfaction

Independent variables (IDV):
toptim: total optimism
tmast: total mastery
tposaff: total positive affect
tnegaff: total negative affect
tpstress: total perceived stress
sex: 1 if male; 2 if female
age: in years

Note: Apply the following guidelines and your class notes to the data given by your instructor.
· Split your data sample randomly as shown in class: Calibration (80%) and Test (20%). Run descriptive tests on your regression dependent variable (DV). Identify outliers, if any, by “mean ± 3*standard deviation.” Keep the outliers for the regressions.
· Use the analysis of descriptive statistics to summarize the data. Comment on the findings.
· Develop an estimated regression equation (regression model) and use it to predict your DV in the test sample. Identify which independent variables are statistically significant. Use the variable names from the header row in the data file to write the regression equation.

Due: February 27, 2022 by 11:59pm
Note: You can submit the Word document online and bring a printout to class.

Calibration Sample:
· Run correlation on all variables (except sex) on the calibration sample. Analyze.
· Run regression on the calibration sample (include sex and all other IDVs).
· Write your model equation. Report adjusted R2 and other relevant statistics as discussed in class.

Test Sample:
· Predict your DV values on the test sample using the regression model from calibration.
· Report the generalization mean squared error (prediction accuracy).

Analytical Report: The report will consist of a cover page and three sections in one single Word file:
1. Cover page with all group members' names.
2. (Section 1) Summarize the results on three single-spaced pages (maximum). As discussed in class, the summary pages should contain brief descriptions of the following: problem description or research question, data sources, DV descriptive statistics, correlation, normality assumptions, regression model equation, model estimation and fit statistics, normality plot, residual plots, model generalization, conclusion and recommendation.
3. (Section 2) Cut and paste the results and plots analyzed in section 1 from your Excel file.
4. (Section 3) Data columns for all the regression variables in the test file only, plus ID, random number, forecast of DV, error, and squared error. The test MSE number needs to be there. Do not keep any extra columns that you may have generated to do the project.

Thresholds:
· Correlation (positive or negative): absolute value between 0 and 0.3 is weak, between 0.3 and 0.5 is moderate, and greater than 0.5 is strong, as discussed in class.
· Skewness and kurtosis: use ± 1 from 0 to report on these statistics.
· Multicollinearity: use 0.7 to determine a potential multicollinearity problem.
· Outlier detection for the DV: use the “mean ± 3*standard deviation” method.

Model Generalization: To test the generalization power of your model, split the sample randomly into 80%/20%. Use the 80% data set to run the linear regression. After you run the regression, you will get a model equation. You will also get an estimate of the mean squared error (MSE of the calibration data). Use the model equation to predict your DV in the test data. Note that your test data already has the actual DV value for each test observation. Compute the test MSE from the actual DV and the predicted DV.
If the two MSEs (one from the regression output and one from the test data) are close (i.e. MSEtest is not more than 1.5*MSEcalibration), your model is generalizing. Your task is to report the two MSE numbers correctly and conclude whether the model is generalizing or not. Based on your overall analysis, report whether the model is implementable or not.

Grading
Report: 100 points – Everything discussed in the class should be in the report
Note: The instructor will use the rubric template on the following page to grade the project.

Rubric Template (MP = maximum point; LLL = Lesson Level Learning objective):
· Research Question (RQ): 3-point scale, MP=2. LLL objective: Recall hypothesis testing concepts. Scores: 0 = No RQ in the report; 1 = OK RQ; 2 = Good RQ.
· Data Source: 3-point scale, MP=2. LLL objective: Learn not to plagiarize. Scores: 0 = No mention of data source; 1 = With mistakes; 2 = Correct.
· Data Splitting: Continuous scale, MP=10. LLL objective: Learn how to test before implementing a model.
· Descriptive Statistics: Continuous scale, MP=6. LLL objective: Recall characteristics of location/spread/skewness/kurtosis statistics.
· Correlation Analysis: Continuous scale, MP=10. LLL objective: Recall the learning about how two variables co-vary.
· Regression Analysis and Results: Continuous scale, MP=60. LLL objective: Demonstrate understanding of the following topics: contrast between categorical and continuous valued independent variables (IDV), contrast between dependent variables (DV) and IDV; apply understanding of research hypotheses to interpret regression results; examine hypothesis test results and model fit.
· Residual Analysis and normality plot: Continuous scale, MP=5. LLL objective: Understand whether the homoscedasticity assumption and the normality assumption hold true in the data.
· Generalization & conclusion: Continuous scale, MP=5. LLL objective: Draw inferences and find evidence to support generalization.

surveydata
sex age tmast tposaff tnegaff tpstress toptim tlifesat
145213615281923 221223724301920 242253216272326 247203025292427 241162324422121 238214017262126 139213535221930 267173517252320 222203436371020 231264022262432 245274117283035 
126273910152334 151243312222224 237223822262228 133193515282313 261264016242731 233253415242218 242201915261814 245253622282119 257193710292423 260233225272530 1231320394695 255283814222727 238213117332830 127223233322020 121233315292733 137262920322418 227214331311829 131152329311616 152264118272920 164253114252519 235203322291816 222173231322323 123253717261832 156203722312413 224183213292218 136172622362022 237231939401914 150263924252732 137272917262722 140254115202531 227223618282128 151273612202630 123202219391417 237273115242935 219244218242125 248143527332114 150212016191727 149164325251723 236213139341420 145284414172722 21812113443138 122253814272215 219192226341623 127243221242629 246233512212426 220202630371514 155233915212419 223253818262024 230153210261511 122233314212528 123214020282123 225244010242429 246174029371730 222184112203034 220212828281816 249222728341612 242213715191835 137263912242024 22014242841146 122244319242022 125232115202125 126243522281427 122284019212826 124203518252125 151263614212725 245192820282123 147193022231620 221182716292315 241254016192431 225253915192529 226223712252329 223213523362331 251263211262419 148223610232024 137234122251830 139183032321613 240192826301912 133203414162625 227262521272011 132244314212329 250183922302520 235203628322417 250173921272114 238283312192628 248153139372019 154213411242319 236242720262412 268284010133035 274234310162634 235174123312524 237254512132729 239203426241914 150172620331512 13322321322149 223213919311927 154163733292019 223193730262322 241234026322818 249203715282430 122223215232125 222173020312023 127254133233023 232262113232317 238243220252120 241234118252826 123223921232433 121243322212626 24318163942208 140253519232425 231193120311922 149274212242723 252194413292834 236193031332223 21914223644179 258263215312724 265284713273030 122233422262420 143253612292116 122224422292016 222254313182929 131192719321718 146243313251717 253162623362611 224272917331718 242232329352113 
245201623281630 270153224383010 146284110172931 142224021252421 13517112034165 234263725162831 236233726332230 244233417301818 131223518231926 229284510212826 146193119311622 223182820312122 249253912252830 244223110202929 223274015182932 142213315252729 229233817232324 166274210192435 221213519252226 241283525302627 135183219322425 228172628341629 136262517242420 136273515282721 227144023242127 170232910192532 235283719222627 133212616262225 235193425322510 122183620302717 156253316202030 223213022281720 131284515262927 249272812203028 231273414253030 163243010152620 136274316222933 248163122322017 230192930362125 226274613232531 221182935362124 120192623302422 126162931291727 222243125291724 221194529282123 244283823262623 126244220322112 226202212291920 132232717262317 241284012233027 240203416251827 244234013222224 248253915252729 218233324262121 154203320292224 147233415222121 274223012232326 165193710202018 145202717302617 221262220212627 248203121292119 133192611222731 244213712212416 145202717302642
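The correlation thresholds and the generalization rule from the guidelines can be sketched in Python as follows. This is a minimal illustration, not the Excel workflow the assignment asks for. All function names are hypothetical, the demo data is synthetic (not the survey data), and two points are assumptions: a correlation of exactly 0.3 is treated as moderate, and the calibration MSE is taken as the residual mean square SSE/(n - k - 1), while the test MSE is the plain average of squared prediction errors.

```python
import random

def correlation_label(r):
    """Label a correlation by |r| using the thresholds in the guidelines."""
    a = abs(r)
    if a > 0.5:
        return "strong"
    if a >= 0.3:  # assumption: exactly 0.3 counts as moderate
        return "moderate"
    return "weak"

def split_rows(rows, test_share=0.2, seed=0):
    """Randomly split rows into (calibration, test) lists, 80%/20% by default."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_share)
    return shuffled[n_test:], shuffled[:n_test]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_ols(X, y):
    """Least-squares coefficients [intercept, b1, ..., bk] via the normal equations."""
    A = [[1.0] + list(row) for row in X]
    p = len(A[0])
    ata = [[sum(r[i] * r[j] for r in A) for j in range(p)] for i in range(p)]
    aty = [sum(r[i] * yi for r, yi in zip(A, y)) for i in range(p)]
    return solve(ata, aty)

def predict(coef, row):
    """Apply the regression equation to one row of IDV values."""
    return coef[0] + sum(b * v for b, v in zip(coef[1:], row))

def generalization_check(rows, ratio_limit=1.5, seed=0):
    """rows: list of (idv_values, dv). Returns (mse_cal, mse_test, generalizes)."""
    cal, test = split_rows(rows, seed=seed)
    coef = fit_ols([x for x, _ in cal], [y for _, y in cal])
    k = len(cal[0][0])
    sse_cal = sum((y - predict(coef, x)) ** 2 for x, y in cal)
    mse_cal = sse_cal / (len(cal) - k - 1)  # assumption: residual mean square
    mse_test = sum((y - predict(coef, x)) ** 2 for x, y in test) / len(test)
    return mse_cal, mse_test, mse_test <= ratio_limit * mse_cal

# Synthetic demo data with two IDVs (NOT the survey data).
rng = random.Random(1)
demo = []
for _ in range(200):
    x = (rng.uniform(10, 30), rng.uniform(10, 50))
    demo.append((x, 5 + 0.8 * x[0] - 0.2 * x[1] + rng.gauss(0, 2)))
mse_cal, mse_test, ok = generalization_check(demo)
print(round(mse_cal, 2), round(mse_test, 2), "generalizes" if ok else "does not generalize")
```

With the real data, each row would hold the seven IDV values and tlifesat, and the two MSE numbers would be reported side by side before applying the 1.5 ratio rule.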
Answered Same Day: Feb 28, 2022

Answer To: Outline for project documents: You need to summarize the research question and analysis of...

Suraj answered on Feb 28 2022
Section 1:
Research question:
The researcher wants to test whether life satisfaction is affected by different aspects of human behavior.
Is there any variable that has a significant effect on a person's life satisfaction score? Are there any outliers or missing values in the data set?
Data set: The data were collected from an experiment in psychology through a set of survey questions answered by the respondents. The data set consists of 8 variables: one dependent variable and seven independent variables. The variables are described as follows:
Dependent variable (DV):
tlifesat:     total life satisfaction
Independent variables (IDV):
toptim:     total optimism
tmast:         total mastery
tposaff:    total positive affect
tnegaff:    total negative affect
tpstress:    total perceived stress
sex:         1 if male; 2 if female
age:        in years
Descriptive Statistics Analysis:
Descriptive statistics are calculated for all the variables. There is one categorical variable (sex); for that variable we report sample proportions or counts of unique values. For the dependent variable we check the normality assumption by plotting a histogram and by using the descriptive statistics table, which is presented in section 2. The mean of the dependent variable is 23.01 and the standard deviation is 6.93; that is, the scores deviate from the mean by...
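As a worked check of the outlier rule from the guidelines (mean ± 3*standard deviation), the reported mean of 23.01 and standard deviation of 6.93 give bounds of about 2.22 and 43.80, so a tlifesat score outside that range would be flagged. A minimal Python sketch of the rule follows; the function names are hypothetical, and flag_outliers assumes the sample standard deviation is the one intended.

```python
import statistics

def outlier_bounds(mean, sd, k=3.0):
    """Return (lower, upper) for the mean ± k*sd rule."""
    return mean - k * sd, mean + k * sd

def flag_outliers(values, k=3.0):
    """Return the values outside mean ± k*sd, using the sample standard deviation."""
    low, high = outlier_bounds(statistics.mean(values), statistics.stdev(values), k)
    return [v for v in values if v < low or v > high]

# Bounds from the mean and standard deviation reported in the answer.
low, high = outlier_bounds(23.01, 6.93)
print(f"{low:.2f} to {high:.2f}")  # prints "2.22 to 43.80"
```

Per the guidelines, flagged observations are reported but kept in the regression.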