


Junction is a small town with two suburbs. The data file “Major Project – Data Set” contains data on 555 houses
sold in Junction between 2016 and 2021. This data includes the price at which the house was sold, which of two
agents sold the house (all houses are sold through an agent by law), the year in which the house was sold as well
as data on various characteristics of each house sold (age, size, number of stories etc.). These characteristics serve
as possible explanatory variables of sale price.



Data definitions follow:




OBS = observation
AGE = age of house in years
SHOPS = 1 if house is close to a shopping precinct, 0 otherwise
CRIME = crime rate of the suburb within which the house is located
TOWN = distance in kilometres to the town centre
STORIES = number of dwelling stories
OCEAN = 1 if house has an ocean view, 0 otherwise
POOL = 1 if house has a pool, 0 otherwise
PRICE = price at which the house was sold (in dollars)
AGENT = selling agent – “W&M” (0) or “A&B” (1)
SIZE = size of the house in square metres
SUBURB = Mayfair (0) or Claygate (1)
TENNIS = 1 if house has a tennis court, 0 otherwise
SOLD = year of last sale (2016 to 2021)




Your tasks




Task 1 – (recommended length: 1.5 pages)




You are required to provide a comprehensive summary of the data set contained in the “Major Project – Data Set”
file. How you choose to do this is entirely at your discretion. However, it is recommended that you consider using
both summary statistics and graphical methods while also noting any peculiarities within the data set.
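
As a rough illustration of the kind of summary and peculiarity check Task 1 asks for, the sketch below uses only the Python standard library. The four rows are invented for illustration and are not taken from the real data file; the helper names `summarise` and `find_peculiarities` are hypothetical, and the three rules checked are simply read off the data definitions (ages cannot be negative, POOL is a 0/1 indicator, SOLD lies between 2016 and 2021).

```python
import statistics

# Hypothetical rows using column names from the data definitions.
# The values are invented for illustration; they are NOT from the real file.
rows = [
    {"OBS": 1, "AGE": 12, "SIZE": 150, "PRICE": 410000, "POOL": 0, "SOLD": 2017},
    {"OBS": 2, "AGE": 30, "SIZE": 210, "PRICE": 520000, "POOL": 1, "SOLD": 2019},
    {"OBS": 3, "AGE": -4, "SIZE": 180, "PRICE": 465000, "POOL": 2, "SOLD": 2021},
    {"OBS": 4, "AGE": 8, "SIZE": 95, "PRICE": 300000, "POOL": 0, "SOLD": 2015},
]


def summarise(rows, column):
    """Return basic summary statistics for one numeric column."""
    values = [row[column] for row in rows]
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


def find_peculiarities(rows):
    """Flag values that contradict the data definitions."""
    issues = []
    for row in rows:
        if row["AGE"] < 0:
            issues.append((row["OBS"], "AGE is negative"))
        if row["POOL"] not in (0, 1):
            issues.append((row["OBS"], "POOL is not 0 or 1"))
        if not 2016 <= row["SOLD"] <= 2021:
            issues.append((row["OBS"], "SOLD is outside 2016-2021"))
    return issues


price_summary = summarise(rows, "PRICE")
issues = find_peculiarities(rows)
print(price_summary)
print(issues)
```

The same pattern extends to the other indicator variables (SHOPS, OCEAN, TENNIS, AGENT, SUBURB). How a flagged observation is handled (corrected, dropped, or kept) is a judgement call that should be stated in the report.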


Task 2 (including Headline Regression Model) – (recommended length: 3 pages)




You have been hired by Joy, the wealthy owner of a house on Elm Street in Junction (not included in the data set)
to predict the price at which her house will sell. Her house has two stories, is in Claygate, is 178 square metres
large, is not near a shopping precinct and is 10 km from the town centre. She estimates that the house is about 10
years old and in a low crime area according to her experiences. Joy inherited the house from her uncle and is
therefore unsure when it was last sold.





You are expected to build a regression model of house prices. In doing so, make sure that you use an appropriate
number of predictors to develop your estimates. Once you have constructed an appropriate model, use it to obtain
and provide for Joy’s house:


1. A point prediction of the sales price which it can be expected to fetch

2. A 95% interval prediction for this sale price

3. An estimate of the marginal effect of house size on this sale price

4. Financial advice on whether Joy should use “W&M” or “A&B” to sell her house. “W&M” charges a
commission of 2.5% whereas “A&B” charges a commission of 3.5% of the final sale price.
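
For item 4, the comparison that matters is the price net of commission rather than the headline price. The sketch below is one way to frame it, assuming the two predicted prices come from a fitted model with AGENT set to 0 and 1 in turn. The two prices used here are hypothetical placeholders, not results, and the function names are illustrative. Only the commission rates come from the brief.

```python
W_AND_M_RATE = 0.025  # commission stated in the brief
A_AND_B_RATE = 0.035  # commission stated in the brief


def net_proceeds(sale_price, commission_rate):
    """Amount the seller keeps after the agent's commission."""
    return sale_price * (1 - commission_rate)


def break_even_premium():
    """Proportional price premium A&B must achieve over W&M
    for the seller to be equally well off after commission."""
    return (1 - W_AND_M_RATE) / (1 - A_AND_B_RATE) - 1


# Hypothetical predicted sale prices for the same house under each agent.
predicted = {"W&M": 500_000.0, "A&B": 510_000.0}
rates = {"W&M": W_AND_M_RATE, "A&B": A_AND_B_RATE}

net = {agent: net_proceeds(price, rates[agent]) for agent, price in predicted.items()}
best_agent = max(net, key=net.get)

print(f"Break-even premium for A&B: {break_even_premium():.4%}")
print(net, best_agent)
```

Under these rates, A&B is only the better choice if it is expected to sell the house for roughly 1.04% more than W&M would. Whether the estimated agent effect is large enough, and statistically distinguishable from zero, is something the regression output has to answer.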


Joy, who claims to have some knowledge of regression analysis, has stressed that she thinks you should use a
regression model with an R² of at least 88%.
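
To make the mechanics behind the point prediction, the 95% prediction interval, the marginal effect and the R² target concrete, here is a deliberately simplified sketch: a one-predictor least-squares fit of PRICE on SIZE in plain Python. It is not the model the brief expects (that model would use several predictors and cleaned data), the eight observations are invented, and the interval uses a normal critical value as a large-sample approximation to the t value. Function names are illustrative.

```python
import math
from statistics import NormalDist, mean


def fit_simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return {
        "n": n,
        "x_bar": x_bar,
        "sxx": sxx,
        "intercept": intercept,
        "slope": slope,  # marginal effect of x in a linear model
        "r_squared": 1 - sse / sst,
        "sigma": math.sqrt(sse / (n - 2)),  # residual standard error
    }


def predict_with_interval(fit, x0, level=0.95):
    """Point prediction and approximate prediction interval at x0.

    Uses a normal critical value, which is only a reasonable stand-in
    for the t value when the sample is large (e.g. several hundred rows).
    """
    point = fit["intercept"] + fit["slope"] * x0
    z = NormalDist().inv_cdf(0.5 + level / 2)
    se = fit["sigma"] * math.sqrt(
        1 + 1 / fit["n"] + (x0 - fit["x_bar"]) ** 2 / fit["sxx"]
    )
    return point, point - z * se, point + z * se


# Invented (SIZE, PRICE) pairs for illustration only.
sizes = [95, 120, 150, 165, 180, 210, 240, 260]
prices = [310000, 352000, 415000, 438000, 470000, 525000, 590000, 615000]

fit = fit_simple_ols(sizes, prices)
point, lower, upper = predict_with_interval(fit, 178)  # Joy's house is 178 square metres

print(f"R-squared: {fit['r_squared']:.3f}")
print(f"Marginal effect of SIZE: {fit['slope']:.0f} dollars per square metre")
print(f"Prediction at 178 square metres: {point:.0f} ({lower:.0f} to {upper:.0f})")
```

In a linear specification the marginal effect of size is just the SIZE coefficient. If PRICE is modelled in logs or SIZE enters with a squared term, the marginal effect depends on the values at which it is evaluated and has to be computed at Joy's house.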


Note: Task 1 directed you to take note of any peculiarities in the data set. There are additional errors in the
data set that you may not have picked up on in Task 1. These will only become clear to you once you start working
on Task 2. Several problems can result if you fail to handle these issues correctly, so be mindful to address them,
both in your regression application as well as your final report. If resolving any of the errors in the data set requires
you to make assumptions, make sure to clearly state your reasoning and approach in your report.





Task 3 – (recommended length: 1.5 pages)




Please provide a reflective discussion on how you executed Task 2 of the project above. Specifically consider
the following:


1. Verify that your regression model does not suffer from any misspecification errors and provide the
relevant regression diagnostics which support your findings.

2. If you found that your model is in fact partially misspecified in part (1) of Task 3 above, explain what you
did to ensure that the misspecification only has a minimal impact on your results in Task 2 above. That is,
explain how you corrected any misspecifications that occurred during your modelling.

3. Were there any other oddities in the data set or your model? Explain.

4. Is there anything else worth mentioning which is relevant to your work or to your results for Joy?
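
One diagnostic that could support item 1 is a check on whether the residuals look approximately normal. The sketch below computes the Jarque-Bera statistic in plain Python. It is only one of several possible diagnostics (tests for heteroskedasticity and functional form would sit alongside it), the residuals shown are invented, and the test is asymptotic, so it is more reliable on a sample of several hundred than on the toy list used here.

```python
import math


def jarque_bera(residuals):
    """Jarque-Bera statistic and p-value for a list of residuals.

    Under the null of normality the statistic is asymptotically
    chi-square with 2 degrees of freedom, whose survival function
    is exp(-x / 2).
    """
    n = len(residuals)
    centre = sum(residuals) / n
    m2 = sum((e - centre) ** 2 for e in residuals) / n
    m3 = sum((e - centre) ** 3 for e in residuals) / n
    m4 = sum((e - centre) ** 4 for e in residuals) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    statistic = n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)
    p_value = math.exp(-statistic / 2)
    return statistic, p_value


# Invented residuals (in dollars) with one large positive outlier.
example_residuals = [-4200, -1500, -800, -300, 150, 600, 900, 1100, 1300, 9500]

jb_stat, jb_p = jarque_bera(example_residuals)
print(f"JB = {jb_stat:.2f}, p = {jb_p:.4f}")
```

A small p-value points towards skewed or heavy-tailed residuals, which can be a symptom of outliers, data errors, or a dependent variable that would be better modelled in logs.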








Task 4 – (recommended length: 1 page)




Sometimes in quantitative research methods, the regression model can be prone to endogeneity problems.
Specifically, the explanatory variable(s) may be influenced by the dependent variable or both may be jointly
influenced by an unmeasured third variable. Given these endogenous relationships, in this task, you need to
discuss another model that can be developed utilizing the given data set. Particularly, you need to provide an
explanation as to what relationship you are trying to explore, what is the underlying reasoning for the relationship,
what variables will be employed in the model, and how exploring this relationship can have practical implications.


Finally, ensure that you provide sufficient discussion on the choice of variables that you wish to include in the
model.


Note: You do not need to execute the empirical model for this task.
Oct 10, 2022