Biostatistics assignment for a master's degree.


Introduction to Biostatistics Assignment 2

Please answer each question in the template document provided and submit via Turnitin on or before the due date. The marks allocated to each question are shown in the assignment. A total of 30 marks are available and this assignment is worth 30% of your overall grade.

Some of the questions in this assignment ask you to analyse the data set assigned to you for assignments. This is the same data set which you used for Assignment 1. Read ‘Description of your data set.docx’ for the descriptions of the variables.

Question 1 (5 marks)

Note: Each student will get different answers as the data sets differ.

Use the assignment data set assigned to you:
Variables to analyse: ‘sex’

a. Calculate the point estimate and 95% confidence interval for the proportion of females in the population of NSW 17-year-olds using the random sample of NSW 17-year-olds assigned to you. (2 marks)
b. Carefully write in words what the confidence interval in part a. is telling us. (2 marks)
c. Are the results in part a. consistent with the statement: “50% of 17-year-olds in NSW are female”? Explain why or why not. (1 mark)

Question 2 (7 marks)

Note: Each student will get different answers as the data sets differ.

Research question: Is average self-reported hours of moderate to vigorous physical activity (MVPA) per week equal between males and females in the population of NSW 17-year-olds?

Use the assignment data set assigned to you:
Variables to analyse: ‘MVPA’ and ‘sex’

a. Use appropriate charts and/or statistics to describe the shape of the distribution of self-reported hours of MVPA per week for 17-year-old males and females in the sample. (2 marks)
b. Use an appropriate non-parametric test and R Commander to test the hypothesis that the average self-reported hours of MVPA per week is equal between males and females in the population of NSW 17-year-olds. Use R Commander for all calculations but write your answers according to the 5-step method.
(5 marks)

Question 3 (4 marks)

A researcher is questioning whether or not the introduction of new laws intended to limit the emission of polycyclic aromatic hydrocarbons (PAH) in the gases emitted from aluminium smelting plants has been effective. She has compiled emission measures for a random sample of six aluminium smelters. For each smelter she has recorded emissions at one year before and two years after the introduction of the new legislation. The PAH concentrations are continuous variables. Results are shown in the following table.

Smelter   PAH concentration prior to      PAH concentration after
          introduction of new laws        introduction of new laws
A         103                             21
B         27                              19
C         407                             320
D         221                             47
E         7,230                           550
F         339                             28

a. The researcher wishes to test her hypothesis that the concentration of PAH in gaseous emissions from aluminium smelters has decreased since the introduction of the new laws. Is this a one-sided or a two-sided hypothesis test? Explain why. (1 mark)
b. Name an appropriate statistical test to address this hypothesis (that the concentration of PAH in gaseous emissions from aluminium smelters has decreased since the introduction of the new laws). Justify your choice of test. DO NOT perform any analysis. (3 marks)

Question 4 (9 marks)

Note: Each student will get different answers as the data sets differ.

Research question: Does mode of transport differ by gender in the population of NSW 17-year-olds?

Use the assignment data set assigned to you:
Variables to analyse: ‘licence’ and ‘sex’

a. Show the relationship between driver’s licence status and gender in the sample of NSW 17-year-olds using a two-way contingency table. Include either row or column percentages. Type and label the table yourself: an R Commander screenshot will not be accepted. (2 marks)
b. Looking at the results in part a. only, is there any evidence of association between gender and licence status in this sample of NSW 17-year-olds? Explain why or why not. (2 marks)
c.
Are the requirements for a Chi-square test met? Explain why. (1 mark)
d. Irrespective of your answer in part c., address the research question using a Chi-square test on the provided data. Please use R Commander but format your answer according to the 5-step method. (4 marks)

Question 5 (5 marks)

Research question:

a. Give one reason why different research studies require different sample sizes. Why not use the same sample size for every research study? (1 mark)
b. Dr Smith asks you to estimate the minimum sample size required to detect a difference of 0.5 hour in mean self-reported sedentary hours per week between 17-year-old NSW boys and girls with [value missing in source] and power = 0.90. (He suggests this 0.5 hour difference could, for example, be a mean of 9.5 hours compared to a mean of 10 hours.) He is confident from his previous reading that the population standard deviation is [value missing in source] and he wishes to use equal group sizes for maximum efficiency. Estimate the minimum sample size required for Dr Smith’s study. Present your answer to Dr Smith as a sentence which summarises the required sample size to achieve what power subject to what conditions. (3 marks)
c. Suppose that, despite the answer in part b., Dr Smith decided to run his study with a sample size of n = 20 per group (n = 40 in total). What impact would this have on the project’s ability to answer the research question? (1 mark)

Description of your data set

In New South Wales, the minimum age for a licence for learning to drive is 16 years. Learning drivers must always be accompanied by an experienced driver when driving. The minimum age for a licence to drive independently is 17 years.

The data set provided to you contains (fictitious) data from a survey of a random sample of 17-year-olds in New South Wales. There are 271 respondents (individuals) and 7 measurements (variables).
The variables are:

licence – driver’s licence status (coded as “not licenced” or “learners permit” or “licenced”)
sex – gender of respondent (coded as “male” or “female”)
activities – number of activities attended in the past month (includes sports, cultural, social and community activities requiring leaving the home, but excludes attending work or school)
transport – most frequently used mode of transport in the past month (coded as “driver”, “passenger” and “other”, where driver includes both car driver and motorcycle rider; passenger includes being a passenger in a car or on a motorcycle; and other includes public transport, walking, push bike riding, skateboarding, etc.)
MVPA – the respondent’s self-reported hours of moderate to vigorous physical activity per week
logMVPA – logarithm-transformed MVPA (the logarithm transformation is used to produce a variable containing the same information as MVPA but with a more symmetric distribution)
sed – the respondent’s self-reported sedentary hours per week

Within R Commander the data set is called ‘survey’. There are 271 lines of data. An 8th column has been added to the data set which shows your student number. If this is not your student number, you are working on the wrong data file!

Please remember that this is a completely artificial data file. It does not reflect any real-world measurements or associations.
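
The calculations asked for in Question 1a and Question 5b can be sketched outside R Commander as a rough cross-check. The Python below is a minimal illustration, not the assignment's prescribed method: it assumes the normal-approximation (Wald) interval for a proportion and the usual normal-approximation formula for comparing two means with equal group sizes and a two-sided test. The function names are made up for this sketch. The count of 140 females, the significance level of 0.05 and the standard deviation of 2.0 are hypothetical placeholders, since the source does not state them; only n = 271, the 0.5 hour difference and power = 0.90 come from the assignment text.

```python
import math
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Point estimate and Wald (normal-approximation) CI for a proportion."""
    p_hat = successes / n
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, p_hat - half_width, p_hat + half_width


def n_per_group(delta, sigma, alpha, power):
    """Minimum n per group to detect a mean difference `delta`.

    Normal approximation, two-sided test, equal group sizes,
    common population standard deviation `sigma`.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)


if __name__ == "__main__":
    # Hypothetical count: 140 females among the 271 respondents.
    p_hat, lower, upper = proportion_ci(successes=140, n=271)
    print(f"estimate {p_hat:.3f}, 95% CI ({lower:.3f}, {upper:.3f})")

    # delta and power are from Question 5b; sigma and alpha are
    # hypothetical because the source omits them.
    n = n_per_group(delta=0.5, sigma=2.0, alpha=0.05, power=0.90)
    print(f"minimum sample size: {n} per group, {2 * n} in total")
```

Both functions rely on large-sample approximations, so results from R Commander may differ slightly, for example if it uses an exact or continuity-corrected interval or a t-based power calculation.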
Sep 15, 2020