Can the same expert who did order 64806 for me please do this one as well?
Causal Inference (CSI) Assignment 1

Question 1

Statins are drugs commonly used to reduce levels of “low-density lipoprotein” (LDL) cholesterol in the blood. They are commonly prescribed to people with high levels of LDL cholesterol. Researchers are interested in the effect of statin use on the risk of death from all causes in elderly, but otherwise healthy, patients. Does taking statins have survival benefits even if you don’t have high levels of LDL cholesterol? Specifically, the research question is: does taking statins daily for five years decrease the risk of death from any cause in people aged 65 years or older who do not have high cholesterol or any comorbidities and have never taken statins in the past?

(a) Define the causal effect of interest in words and using potential outcomes notation.

(b) Define a target trial aimed at estimating the causal effect in part (a). Include a description of:
• the eligibility criteria, including the target population;
• ‘treatment strategies’: specify the treatment and control group;
• treatment assignment procedures;
• the outcome measure;
• the follow-up period.
This description should not exceed half a page in total. Note in particular that we do not require a medical definition of high cholesterol or detailed inclusion/exclusion criteria.

(c) Researchers had access to data from a large observational study containing data on 10,000 people aged 65 years that was linked to death records. At the time of their inclusion in the study, participants were asked whether they were taking statins daily, their comorbidities and demographic characteristics were recorded, and a measure of their LDL cholesterol was taken. Dates of death were also included for those participants who died within five years of entry to the study. The causal effect of interest is the effect of taking statins at the time of recruitment to the study versus not taking statins at the time of recruitment on the risk of death over 5 years.
Address the following:

(i) Draw a causal diagram that represents the causal relationships between statin use, LDL cholesterol level, death within 5 years, demographic characteristics and comorbidities. Include “comorbidities” as one node on the diagram, and “demographic characteristics” as another node on the diagram. Be sure to include nodes for unmeasured variables if necessary. [Hint: think about the timing of the measurement of LDL cholesterol in relation to the timing of the use of statins.]
(ii) State in notation and in words the conditional exchangeability assumption in the context of the causal effect defined in part (c).
(iii) Using your causal diagram, discuss whether conditional exchangeability is likely to hold, and if not, what would be needed for it to hold.
(iv) State the consistency assumption, in notation and in words, in the context of the causal effect from part (c).
(v) State the positivity assumption, in notation and in words, in the context of the causal effect from part (c).
(vi) Are the consistency and positivity assumptions likely to hold for this example? Why or why not?
(vii) Consider now the causal effect defined in part (a). Can the causal effect from part (a) be estimated from this dataset? Why or why not? You do not need to state how you would estimate the causal effect.

Question 2

In this question we consider a causal diagram and research question inspired by Staplin et al., Use of causal diagrams to inform the design and interpretation of observational studies: an example from the Study of Heart and Renal Protection (SHARP), Clinical Journal of the American Society of Nephrology, 2017, 12:546-552. The research question we consider is whether smoking has an effect on progression to end-stage renal disease among patients who have established renal disease.
We consider a dataset simulated to mimic some of the key features of the SHARP dataset, and a simplified version of the causal diagram from the Staplin et al. paper, shown below.

[Figure: causal diagram with nodes: Smoking status; BMI; Blood pressure; Age, Sex, Ethnicity; Diabetes, angina, PAD, CVD; End-stage renal disease]
PAD = peripheral artery disease; CVD = cerebrovascular disease

(a) Using the SHARPmimic.csv dataset (and referring to the SHARPmimicREADME.txt file for an explanation of variables in the dataset), estimate the causal relative risk for the effect of smoking on the occurrence of end-stage renal disease using an inverse probability weighting approach. Write a report on this analysis (2 pages maximum length, including up to two figures and one table), being sure to include the following (not necessarily in this order):
• A discussion of which variables should be adjusted for in order to estimate the causal effect of smoking status on end-stage renal disease, considering the causal diagram shown above.
• Standardised differences of the confounders, both before and after weighting. Be sure to comment on whether balance has been improved by weighting the sample.
• Discussion of the common support condition. Do you think that any participants need to be excluded to satisfy this condition?
• Estimates and confidence intervals for the relative risk, obtained via both standard regression modelling and inverse probability weighting. What do you conclude about the effect of smoking on end-stage renal disease? How sensitive are your results to unmeasured confounding?
• Include your code in an appendix, as well as any additional tables and figures. This appendix does not count towards the 2 page limit.

(b) In inverse probability weighting, it can frequently occur that some participants have very large weights compared to most participants.
One method that may be employed to deal with very large weights is weight truncation, where all weights that exceed a threshold are set equal to that threshold. Considering the inverse probability weights that you generated in your analysis of this dataset, and without excluding any participants (i.e. for this question do not exclude participants to satisfy common support), generate three sets of truncated weights, by setting all weights greater than the 100 × pth percentile equal to the 100 × pth percentile, and all weights lower than the 100 × (1 − p)th percentile equal to the 100 × (1 − p)th percentile, for p = 0.99, 0.95, and 0.9 (where these percentiles are calculated using all weights).
• For each set of truncated weights, calculate the standardised differences of confounders using these truncated weights. Present these in a table with one column for the original (unweighted) sample, one column using the untruncated weights, and one each for p = 0.99, 0.95, and 0.9. How do the standardised differences change with increasing levels of truncation?
• For each set of truncated weights, calculate the relative risk for the effect of smoking on end-stage renal disease. Comment on how the relative risk changes with increasing levels of truncation.

(c) Consider the extended causal diagram below.

[Figure: extended causal diagram with nodes: Smoking status; BMI; Blood pressure; Age, Sex, Ethnicity; Diabetes, angina, PAD, CVD; End-stage renal disease; ACR; U]
PAD = peripheral artery disease; CVD = cerebrovascular disease; ACR = albumin to creatinine ratio; U = unmeasured confounders.

In previous analyses aimed at estimating the causal effect of smoking on the risk of end-stage renal disease, researchers have adjusted for ACR (both using a standard regression approach and using propensity score-based methods). Comment on this analysis approach.
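The truncation rule described in part (b) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `truncate_weights` is hypothetical, the example weights are simulated rather than taken from SHARPmimic.csv, and NumPy's default (linear interpolation) percentile definition is assumed, which may differ slightly from other software.

```python
import numpy as np


def truncate_weights(weights, p):
    """Truncate weights at the 100*(1-p)th and 100*p-th percentiles.

    Hypothetical helper: percentiles are computed over all weights,
    as the brief specifies, using NumPy's default interpolation.
    """
    w = np.asarray(weights, dtype=float)
    lower, upper = np.percentile(w, [100 * (1 - p), 100 * p])
    # Values below `lower` become `lower`; values above `upper` become `upper`.
    return np.clip(w, lower, upper)


# Simulated stand-in weights (NOT the assignment data): inverse of
# made-up propensity scores, so a few weights are very large.
rng = np.random.default_rng(0)
example_weights = 1.0 / rng.uniform(0.02, 0.98, size=1000)

# One truncated set per level requested in the brief.
truncated = {p: truncate_weights(example_weights, p) for p in (0.99, 0.95, 0.9)}
```

Each entry of `truncated` can then be passed to whatever routine computes the weighted standardised differences and the weighted relative risk, so that the same analysis is repeated once per truncation level.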
SHARPmimic.csv variable description
----------------------------------------------
Variable        Description
id              Patient ID number
exposure        Exposure variable: 0=nonsmoker; 1=smoker
sex             Patient sex: 0=female; 1=male
age             Patient age in years
ethnicity       Patient ethnicity: categorical variable with 5 levels
angina          Indicator for angina: 0=absent; 1=present
pad             Indicator for peripheral artery disease: 0=absent; 1=present
cvd             Indicator for cerebrovascular disease: 0=absent; 1=present
diabetes        Indicator for diabetes: 0=absent; 1=present
sysbp           Systolic blood pressure
diasbp          Diastolic blood pressure
bmi             BMI, body mass index
drinksalcohol   Does the patient drink alcohol? 0=no; 1=yes
outcome         Outcome variable: 0=no end-stage renal disease; 1=end-stage renal disease
----------------------------------------------
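As a rough illustration of how the `exposure` and `outcome` columns described above enter an inverse probability weighting analysis, the sketch below computes unstabilised weights, a weighted relative risk, and a weighted standardised difference. It is a minimal sketch, not the required analysis: the function names are hypothetical, the propensity scores are assumed to have been estimated already by some model of `exposure` given the chosen confounders, the standardised difference uses one common definition (difference in weighted means over the pooled weighted standard deviation), and the toy arrays are made up.

```python
import numpy as np


def ipw_weights(exposure, propensity):
    """Unstabilised weights: 1/ps for the exposed, 1/(1-ps) for the unexposed."""
    a = np.asarray(exposure)
    ps = np.asarray(propensity, dtype=float)
    return np.where(a == 1, 1.0 / ps, 1.0 / (1.0 - ps))


def weighted_relative_risk(exposure, outcome, weights):
    """Ratio of the weighted outcome risks in the exposed and unexposed groups."""
    a = np.asarray(exposure)
    y = np.asarray(outcome, dtype=float)
    w = np.asarray(weights, dtype=float)
    risk_exposed = np.average(y[a == 1], weights=w[a == 1])
    risk_unexposed = np.average(y[a == 0], weights=w[a == 0])
    return risk_exposed / risk_unexposed


def standardised_difference(x, exposure, weights=None):
    """Difference in (weighted) means divided by the pooled (weighted) SD."""
    x = np.asarray(x, dtype=float)
    a = np.asarray(exposure)
    w = np.ones_like(x) if weights is None else np.asarray(weights, dtype=float)
    means, variances = [], []
    for group in (1, 0):
        in_group = a == group
        m = np.average(x[in_group], weights=w[in_group])
        # Weighted variance with divisor sum(w), i.e. no small-sample correction.
        v = np.average((x[in_group] - m) ** 2, weights=w[in_group])
        means.append(m)
        variances.append(v)
    return (means[0] - means[1]) / np.sqrt((variances[0] + variances[1]) / 2.0)


# Made-up toy values, NOT rows of SHARPmimic.csv.
exposure = np.array([1, 1, 0, 0])
outcome = np.array([1, 1, 1, 0])
age = np.array([61.0, 63.0, 60.0, 62.0])
propensity = np.array([0.5, 0.5, 0.5, 0.5])  # assumed to come from a fitted model

weights = ipw_weights(exposure, propensity)
rr = weighted_relative_risk(exposure, outcome, weights)
smd_age = standardised_difference(age, exposure, weights)
```

The sketch gives point estimates only. A confidence interval for the weighted relative risk would need an additional step, for example a bootstrap that re-estimates the propensity scores in each resample or a robust variance estimator.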