SIT741 Problem Solving Task 2

I would like this looked at by someone who is really experienced in R programming, but NOT by the same tutor as Order ID90493.


Unit Chair: Sergiy Shelyag
Due: 24 September 2021

Problem Solving Task 2 contributes 25% of your final SIT741 mark (The full mark is 100). It must be completed individually, and submitted to CloudDeakin before the due date: 8 pm, 24/09/2021 (Week 10 Friday).

In this assignment, you will apply your learning to further analyse the 2013-2014 emergency department (ED) demands at Perth and its connection with weather events. This activity builds on Assignment 1; you may want to review your assignment 1 solution and identify any reusable code. Please start early so that you can identify any skill/knowledge gap and seek support from the teaching staff and other students.

Application scenario

You work in a data science team that tries to model the ED demands in the Perth area to improve the demand prediction. For your convenience, you are provided with the following data links, but you are encouraged to include other relevant data for your analyses.
1. The emergency departments admissions and attendances data set provided by the Department of Health of Western Australia: http://data.gov.au/dataset/emergency-department-admissisons-and-attendances
2. The daily temperature and precipitation data for the region accessible through the NOAA data APIs. https://www.ncdc.noaa.gov/cdo-web/webservices/v2
   Of particular relevance is the “Global Historical Climatology Network - Daily” data:
   https://www.ncdc.noaa.gov/ghcn-daily-description
   https://www1.ncdc.noaa.gov/pub/data/ghcn/daily/readme.txt

Task 1: Source weather data (10 points)

From Assignment 1, you have processed data for the ED demands. We still need to find local weather data from the same period. You are encouraged to find weather data online. Besides the NOAA data, you may also use data from the Bureau of Meteorology historical weather observations and statistics (http://www.bom.gov.au/climate/data-services/station-data.shtml). (The NOAA Climate Data might be easier to process.) Answer the following questions:
1. Which data source do you plan to use? Justify your decision. (4 points)
2. From the data source identified, download daily temperature and precipitation data for the region during the relevant time period. (Hint: If you download data from NOAA https://www.ncdc.noaa.gov/cdo-web/, you need to request an NOAA web service token for accessing the data.) (2 points)
3. Answer the following questions:
   · How many rows are in the data? (2 points)
   · What time period does the data cover? (2 points)

Task 2: Model planning (10 points)

Careful planning is essential for a successful modelling effort. Please answer the following planning questions.
1. Model planning:
   · How will the final model be used? (1 point)
   · How will it be relevant to the overcrowding problems at our EDs? (You may find some inspiration here http://bit.ly/2p5qLH6 .) (1 point)
   · Who are the potential users of your model? (1 point)
2. Relationship and data:
   · What relationship do you plan to model or what do you want to predict? (1 point)
   · What is the response variable? (1 point)
   · What are the predictor variables? (1 point)
   · Will the variables in your model be routinely collected and made available soon enough for prediction? (1 point)
   · As you are likely to build your model on historical data, will the data in the future have similar characteristics? (1 point)
3. What statistical method(s) will be applied to generate the model? Why? (2 points)

Task 3: Model the ED demands (30 points)

We will start with simple models and gradually make them more complex and improve them. We will focus on the ED demand variable(s) that you defined in Assignment 1. Let’s denote it Y. Randomly pick a hospital from the ED dataset.
1. Which hospital do you pick? (1 point)
2. Fit a linear model for Y using date as the predictor variable. Plot the fitted values and the residuals. Assess the model fit. Is a linear function sufficient for modelling the trend of Y? Support your conclusion with plots. (4 points)
3. As we are not interested in the trend itself, relax the linearity assumption by fitting a generalised additive model (GAM). Assess the model fit. Do you see patterns in the residuals indicating insufficient model fit? (5 points)
4. Augment the model to incorporate the weekly variations. (5 points)
5. Compare the models using the Akaike information criterion (AIC). Report the best-fitted model through coefficient estimates and/or plots. (5 points)
6. Analyse the residuals. Do you see any remaining correlation patterns among the residuals? (4 points)
7. What data type is your day-of-the-week variable? (3 points) Does the data type of this variable affect the model fit? (3 points)

Task 4: Heatwaves and ED demands (30 points)

The connection between heatwaves and the ED demands is widely reported, as in this news article. http://bit.ly/2kTE4cu In this task, you will try to measure the heatwave and assess its impact on the ED demands.

Task 4.1: Measuring heatwave (8 points)
1. John Nairn and Robert Fawcett from the Australian Bureau of Meteorology have proposed a measure for the heatwave, called the excess heat factor (EHF). Read the following article to understand the definition of the EHF. (3 points) https://dx.doi.org/10.3390%2Fijerph120100227
2. Use the NOAA data to calculate the daily EHF values for the Perth area during the relevant time period. Plot the daily EHF values. (5 points)

Task 4.2: Models with EHF (7 points)
Use the EHF as an additional predictor to augment the model(s) that you fitted before.
· Report the estimated effect of the EHF on the ED demand. (3 points)
· Does the extra predictor improve the model fit? (1 point)
· What conclusions can you draw? (3 points)

Task 4.3: Research question - extra weather features (15 points)
· Can you think of extra weather features that may be more predictive of ED demands? (5 points)
· Try incorporating your feature into the model and see if it improves the model fit. (10 points)

Task 5: Reflection (20 points)

In the form of a short report (500-1000 words, 1-2 pages), answer the following questions:
1. We used some historical data to fit regression models. What are the limitations of such data, if any? (5 points)
2. Regression models can be used for 1) understanding a process, or 2) making predictions. In this assignment, do we have reasons to choose one objective over the other? (5 points) How would the decision affect our models? (5 points)
3. Overall, have your analyses answered the questions that you set out to answer? (5 points)

What to submit

By the due date, you are required to submit the following files to the assignment Dropbox in CloudDeakin.
1. An MS Word or PDF file containing your answers to all the assignment questions.
2. An R Notebook file Assignment2_submission.Rmd containing all your code. The file should be able to run. Include sufficient comments so that the script can be understood by your marker. Indicate all the packages that need to be installed separately.

Marking criteria

Your submission will be marked using the following criteria.
· Showing good effort through completed tasks.
· Applying statistical thinking to understand the problems and to identify solutions.
· Applying statistical programming skills to obtain data and to process them for data analysis.
· Applying regression modelling techniques to discover and quantify relationships among variables.
· Demonstrating creativity and resourcefulness in solutions.
· Showing attention to details through a good quality assignment report.
Bonus mark may be awarded for completing optional tasks.
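For Task 4.1, the daily EHF calculation might be sketched roughly as follows. This assumes the commonly cited Nairn and Fawcett form, in which a three-day average of daily mean temperature is compared with a long-term 95th percentile (significance index) and with the average of the previous 30 days (acclimatisation index), and EHF is the significance index multiplied by max(1, acclimatisation index). The function name `ehf`, its arguments and the toy temperatures are hypothetical, and `t95` is simply passed in here; in practice it should come from a long reference climatology as described in the article.

```r
# Hypothetical helper: daily excess heat factor (EHF) from a vector of daily
# mean temperatures in degrees C (for example, the mean of daily max and min).
# t95: 95th percentile of daily mean temperature over a reference period
#      (assumed to be supplied by the caller in this sketch).
ehf <- function(dmt, t95) {
  n <- length(dmt)
  out <- rep(NA_real_, n)
  if (n < 33) return(out)            # need 30 prior days plus a 3-day window
  for (i in 31:(n - 2)) {
    three_day <- mean(dmt[i:(i + 2)])          # days i, i+1, i+2
    prev_30   <- mean(dmt[(i - 30):(i - 1)])   # the 30 days before day i
    ehi_sig   <- three_day - t95               # significance index
    ehi_accl  <- three_day - prev_30           # acclimatisation index
    out[i]    <- ehi_sig * max(1, ehi_accl)
  }
  out
}

# Toy series (made-up values): 30 mild days followed by a 3-day hot spell.
dmt <- c(rep(20, 30), 30, 30, 30, 20, 20)
ehf_values <- ehf(dmt, t95 = 25)
```

Days without a full 30-day history or a complete three-day window are left as NA, so the weather download needs to start at least 30 days before the first ED date if every ED day is to get an EHF value. Positive values indicate heatwave conditions under this definition.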
,Royal Perth Hospital,,,,,,,Fremantle Hospital,,,,,,,Princess Margaret Hospital For Children,,,,,,,King Edward Memorial Hospital For Women,,,,,,,Sir Charles Gairdner Hospital,,,,,,,Armadale/Kelmscott District Memorial Hospital,,,,,,,Swan District Hospital,,,,,,,Rockingham General Hospital,,,,,,,Joondalup Health Campus,,,,,,
Date,Attendance,Admissions,Tri_1,Tri_2,Tri_3,Tri_4,Tri_5,Attendance,Admissions,Tri_1,Tri_2,Tri_3,Tri_4,Tri_5,Attendance,Admissions,Tri_1,Tri_2,Tri_3,Tri_4,Tri_5,Attendance,Admissions,Tri_1,Tri_2,Tri_3,Tri_4,Tri_5,Attendance,Admissions,Tri_1,Tri_2,Tri_3,Tri_4,Tri_5,Attendance,Admissions,Tri_1,Tri_2,Tri_3,Tri_4,Tri_5,Attendance,Admissions,Tri_1,Tri_2,Tri_3,Tri_4,Tri_5,Attendance,Admissions,Tri_1,Tri_2,Tri_3,Tri_4,Tri_5,Attendance,Admissions,Tri_1,Tri_2,Tri_3,Tri_4,Tri_5
1-Jul-13,235,99,8,33,89,85,20,155,70,N/A,25,67,54,9,252,59,N/A,13,75,159,4,52,6,N/A,N/A,7,20,25,209,109,N/A,42,108,47,11,166,19,N/A,16,62,79,9,133,17,N/A,29,59,42,N/A,155,10,N/A,12,51,81,11,267,73,N/A,27,75,151,12
2-Jul-13,209,97,N/A,41,73,80,14,145,56,N/A,22,51,47,24,219,47,N/A,14,61,139,4,43,7,N/A,N/A,N/A,25,17,184,112,N/A,43,73,61,5,175,25,N/A,26,73,55,19,129,25,N/A,25,57,43,4,145,24,N/A,26,45,65,8,241,81,N/A,23,78,133,7
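The ED file has a two-row header: hospital names appear only above the first column of each group, and missing counts are written as N/A. One possible way to read such a layout in base R is sketched below on a cut-down copy of the sample (two hospitals, three measures each, an abbreviation made only for this illustration). The variable names `raw`, `hdr`, `hosp`, `meas` and `ed` are hypothetical.

```r
# Cut-down copy of the sample rows (the real file has nine hospitals and
# seven measures per hospital; it would be read from disk instead).
raw <- c(
  ",Royal Perth Hospital,,,Fremantle Hospital,,",
  "Date,Attendance,Admissions,Tri_1,Attendance,Admissions,Tri_1",
  "1-Jul-13,235,99,8,155,70,N/A",
  "2-Jul-13,209,97,N/A,145,56,N/A"
)

# Read the two header rows as plain text.
hdr  <- read.csv(text = raw[1:2], header = FALSE, colClasses = "character")
hosp <- unlist(hdr[1, ], use.names = FALSE)
meas <- unlist(hdr[2, ], use.names = FALSE)

# Carry each hospital name forward over the blank cells that follow it.
for (j in seq_along(hosp)[-1]) {
  if (hosp[j] == "") hosp[j] <- hosp[j - 1]
}

# "Date" has no hospital above it; every other column becomes Hospital_Measure.
col_names <- ifelse(hosp == "", meas, paste(hosp, meas, sep = "_"))

# Read the data rows, treating "N/A" as missing and keeping names unchanged.
ed <- read.csv(text = raw[-(1:2)], header = FALSE, col.names = col_names,
               na.strings = "N/A", check.names = FALSE,
               stringsAsFactors = FALSE)

# "%b" matches abbreviated month names in the current locale (English assumed).
ed$Date <- as.Date(ed$Date, format = "%d-%b-%y")
```

A single hospital's series can then be selected by name, for example `ed[["Royal Perth Hospital_Admissions"]]`.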

Subhanbasha answered on Sep 25 2021
Task 1:
1.
Ans: I plan to use the Bureau of Meteorology historical weather data. It provides daily, weekly, monthly and yearly observations, including temperature for the required location. We download the daily temperature data for the Perth area, as required, and use it for the further analysis.
2.
Ans: Downloaded the data for the Perth area from the website and saved it in CSV format.
3.
Ans: There are 550 records (observations) in the temperature dataset.
    The data covers July 2013 to December 2014, that is, one and a half years of data for the analysis.
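The row count and time period can be checked directly in R. The sketch below uses a tiny made-up data frame in place of the downloaded file; the column names (`date`, `tmax`, `tmin`) and all values are assumptions for illustration, and the real data would be loaded with `read.csv()` from wherever the CSV was saved.

```r
# Made-up stand-in for the downloaded weather CSV (values are not real data).
weather <- data.frame(
  date = c("2013-07-01", "2013-07-02", "2013-07-04"),
  tmax = c(18.2, 17.5, 19.1),
  tmin = c(8.4, 7.9, 9.3)
)
weather$date <- as.Date(weather$date)

n_rows <- nrow(weather)             # how many rows are in the data
period <- range(weather$date)       # first and last date covered

# Calendar days spanned, to compare against the row count.
span_days    <- as.integer(diff(period)) + 1L
missing_days <- span_days - n_rows  # > 0 means gaps, < 0 means duplicate dates
```

Comparing `n_rows` with `span_days` is a quick consistency check: for complete daily data with one row per day, the two numbers should be equal.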
Task 2: Model planning
1.
· Ans: The final model will be used to predict the ED demand from the date, that is, the expected demand on a given day. This will help the ED prepare in advance according to the expected demand.
· Ans: Waiting at the emergency department is painful for patients and is not good practice. If we model and predict the ED demand by day, we can act to reduce patients' waiting time at the hospital, so that patients receive treatment as quickly as possible.
· Ans: The potential users of the model are the hospital authorities and, indirectly, the patients who use the emergency department services. Patients can then access ED treatment more quickly and be admitted in a timely manner.
2.
· Ans: We plan to model a linear relationship; if a linear relationship does not hold, we can use a non-linear one. With this model we can predict the ED demand by day and prepare the required resources in advance.
· Ans: The response variable is Admissions. This is the quantity we are going to predict, so it is also called the dependent variable.
· Ans: The predictor variable is Date. We are going to predict Admissions based on the date, so Date is also called the independent variable.
· Ans: Yes, the variables are collected daily and are available for modelling.
· Ans: A model built on historical data learns the pattern of the relationship between the variables in order to predict the response variable. Future data may not have the same characteristics as the historical data, so that the trend...
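
The relationship planned above (Admissions as the response, Date as the predictor, linear first and non-linear if needed) could be fitted along these lines. This is only a sketch on synthetic data: the data frame `ed`, the day index `t` and the simulated counts are invented for illustration, and a quadratic term stands in for "non-linear" here, whereas the assignment itself asks for a GAM.

```r
# Synthetic daily admissions for one hospital (NOT the real ED data).
set.seed(1)
ed <- data.frame(Date = seq(as.Date("2013-07-01"), by = "day", length.out = 365))
ed$t <- as.numeric(ed$Date - min(ed$Date))     # days since the first date
ed$Admissions <- rpois(nrow(ed), lambda = 90 + 0.0003 * (ed$t - 180)^2)

# Linear trend: Admissions (response) explained by time (predictor).
fit_lin <- lm(Admissions ~ t, data = ed)

# A simple non-linear alternative: quadratic trend in time.
fit_quad <- lm(Admissions ~ poly(t, 2), data = ed)

# Residuals of the linear fit; a visible pattern against t suggests misfit.
res_lin <- resid(fit_lin)

# AIC table for the two candidates (lower AIC is preferred).
aic_table <- AIC(fit_lin, fit_quad)
```

Plotting `res_lin` against `ed$Date` is the usual way to judge whether the straight-line trend is adequate before moving to a more flexible model.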