Module 3 Time-invariant exposures and propensity scores Module 3 objectives At the end of this module students should be able to: • Understand when it is appropriate to use regression models to...

Please note this unit is called Causal Statistical Inference




Module 3 Time-invariant exposures and propensity scores

Module 3 objectives

At the end of this module students should be able to:
• Understand when it is appropriate to use regression models to estimate average causal effects of exposures, when exposure does not change over time, and how to do so;
• Explain the meaning of the “propensity score” and its covariate balancing properties;
• Be able to calculate propensity scores, use these to calculate inverse probability weights, and explain the concept of the pseudo-population and its importance in inverse probability weighting;
• Use inverse probability weighting to estimate average causal effects, and understand the underlying assumptions of this method;
• Assess the balance of characteristics between groups using standardised differences, both before weighting and after weighting using inverse probability weights;
• Have awareness of different propensity score-based methods for estimating average causal effects and be able to explain the key differences between the methods;
• Apply the e-value method for assessing the impact of a binary unmeasured confounder.

In this module we discuss methods for estimating average causal effects. This takes us from the theory presented in the first two modules of the subject, where we discussed the concepts and fundamental assumptions underlying causal inference and causal thinking, and how to represent underlying causal structures using causal diagrams, towards practical application of methods for estimating causal effects. In this module you will learn how to apply techniques to estimate average causal effects. We discuss the conditions under which regression modelling can be used to estimate average causal effects, and discuss propensity score-based methods. We focus on inverse probability weighting, but do touch on some other methods.
Finally, we discuss the e-value approach for assessing the impact of a binary unmeasured confounder on estimates: since we can never be certain that all confounders have been measured and adjusted for appropriately, such approaches are necessary for assessing the sensitivity of our conclusions to this assumption.

3.1 Introduction: setting up the problem

As we have learnt in Modules 1 and 2, when analysing data there is almost always a causal question in mind, which must be stated precisely. In Module 3 we will learn how to estimate causal effects using regression models and “propensity score”-based methods, focusing on the estimation of Average Causal Effects (ACE), which were described in Module 1. In this Module we will consider estimating the effect of a time-invariant binary exposure on some outcome: that is, we consider situations where a study participant can be defined as “exposed” or “not exposed”, and this definition does not change over time.

An example that we will consider throughout this module is the asthma example, reported by Williamson and Forbes in Reading 3.1 of this module. Using a subset of data from the Tasmanian Longitudinal Health Study, these researchers investigated whether, among adults who were reported as having asthma as children, personal smoking affected whether a participant still had asthma as an adult. In 1968, parents were asked to state whether their children had asthma. In 2004, these children were followed up: they were classified as exposed (“smokers”) if they had a history of smoking at the time of the survey, and it was recorded whether they still had asthma. This is of course an over-simplification! The outcome in the study was asthma remission: if a participant no longer had asthma as an adult, they were regarded as being in remission. As stated in that key reading, the results of this analysis are for illustrative purposes only.
Here the causal effect of interest is the effect of smoking on the presence of adult asthma, among adults who had asthma as children. The “gold standard” for assessing whether smoking leads to a higher or lower risk of asthma remission among participants who reported having asthma as children would be a well-conducted randomised experiment, where eligible patients are randomised to either smoke or not. However, randomised experiments to answer this question cannot be conducted: we cannot randomise people to receive exposures that we know are harmful. Instead, researchers had access to an observational dataset, where whether a participant smoked was not randomised and was instead the personal choice of the participant. This observational study recorded information about participants, including their sex, whether their parents smoked, whether they had chronic bronchitis, and most of the other variables shown in Figure 3.1 (underlying atopy and parental asthma were not observed).

If the exposure (smoking) were randomised, we would expect that, on average, the participants who did receive the exposure (i.e. did smoke) would be similar to the participants who did not receive the exposure (i.e. did not smoke). As we have already seen in Modules 1 and 2, when exposure is not randomised, there is no such guarantee. Participants chose for themselves whether to smoke or not and hence there may be confounding of the relationship between the exposure (smoking) and the outcome (adult asthma).

[Figure 3.1 diagram. Nodes: Personal smoking, Adult asthma, SES, Parental asthma, Parental smoking, Chronic bronchitis, Childhood asthma, Underlying atopy, Sex]
Figure 3.1: A causal diagram for the adult asthma example. The nodes shaded in gray are the characteristics that need to be adjusted for to close all open paths from personal smoking to adult asthma. SES is socio-economic status. For this example, underlying atopy and parental asthma are unmeasured: to represent this, we have not drawn boxes around these variables in the diagram.

How do we know which set of variables we need to control for in order to estimate the causal effect of smoking on adult asthma? Recall the causal diagrams of Module 2! The researchers, in consultation with respiratory experts, developed the causal diagram in Figure 3.1. This causal diagram includes measured and unmeasured confounders: here, underlying atopy and parental asthma are unmeasured, while all other variables have been measured. According to the rules discussed in Module 2, the nodes shaded in gray are the characteristics that need to be adjusted for in order to obtain an unbiased estimate of the effect of personal smoking on asthma remission. You can verify this for yourself by applying those rules.

Exercise 3.1
Consider the causal diagram in Figure 3.1. Is the backdoor path from personal smoking to adult asthma that goes from parental smoking through sex and underlying atopy (only through these two variables) open or closed? What if we did not adjust for sex?

The problem in the asthma example is that participants who did smoke may be different (in both measured and unmeasured ways!) to those participants who did not smoke. Perhaps people are less likely to smoke if they have chronic bronchitis, or maybe men are more likely to smoke than women. It is thus possible that, when we estimate the effect of smoking on adult asthma (i.e. obtain an estimate of the Average Causal Effect), any difference we observe between these exposure groups is due to these differences between the characteristics of participants who do and do not smoke, rather than due to the effect of smoking itself.
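The concern in the preceding paragraph can be made concrete with a small simulation. The sketch below is illustrative only: the variable names and every probability in it are invented, not taken from the Tasmanian Longitudinal Health Study. Smoking is given no effect on remission, yet the crude comparison of smokers and non-smokers shows a difference, because chronic bronchitis influences both smoking and remission.

```r
# Simulated illustration only: names and probabilities are hypothetical,
# not estimates from the asthma study.
set.seed(1)
n <- 20000

# Binary confounder: people with chronic bronchitis are less likely to smoke
# and less likely to be in asthma remission.
bronchitis <- rbinom(n, 1, 0.3)
smoker     <- rbinom(n, 1, ifelse(bronchitis == 1, 0.2, 0.5))

# Remission depends on bronchitis only: smoking has NO effect here.
remission  <- rbinom(n, 1, ifelse(bronchitis == 1, 0.3, 0.6))

# Crude risk difference between exposure groups (confounded).
crude_rd <- mean(remission[smoker == 1]) - mean(remission[smoker == 0])

# Risk difference within each level of the confounder, then averaged over
# the confounder's distribution in the whole sample (standardisation).
rd_by_stratum <- sapply(0:1, function(b) {
  mean(remission[smoker == 1 & bronchitis == b]) -
    mean(remission[smoker == 0 & bronchitis == b])
})
adjusted_rd <- sum(rd_by_stratum * prop.table(table(bronchitis)))

print(round(c(crude = crude_rd, adjusted = adjusted_rd), 3))
```

The crude difference is clearly positive even though the simulated effect is zero, while the stratified estimate is close to zero. With a single measured binary confounder, stratification is enough; the methods in the rest of the module address the more realistic case of several confounders.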
Hence, when we compare the outcomes of participants who did and did not smoke, we want these groups of participants to look as similar as possible, just as we would expect them to look had they been randomised to smoke or not to smoke. We will first consider measured characteristics of participants, and worry later about the impact of unmeasured characteristics.

There are many different approaches to getting our exposure groups to look as similar as possible. One standard approach, which you will have already seen and applied in other subjects, is the regression approach where baseline characteristics of subjects in the study are included in a regression model for the outcome, along with the exposure. We will review the regression approach, and discuss when this approach will produce valid estimates of causal effects. We will also consider propensity score-based approaches. Briefly, the propensity score is the probability that a particular subject has received the exposure given the set of variables necessary to eliminate confounding bias between the exposure and the outcome (i.e. the probability that a subject is exposed given this set of variables). Recall from Module 2 that confounding bias arises from open backdoor paths between the exposure and the outcome: we need to adjust for those variables that close open backdoor paths.

The first reading provides an overview of the topics that will be covered in this module and some further detail on the asthma and smoking example. You may want to read the entire article now, or dip in and out as you read through the Module notes.

Reading 3.1
Williamson EJ, Forbes AB (2014). Introduction to propensity scores. Respirology, 19, 625-635.

3.2 The Average Causal Effect

Before discussing techniques for estimating the average causal effect, we review the definition of the average causal effect.
This is a contrast (or, more simply put, a comparison) between the average potential outcomes that would have been observed had all members of our population been exposed, and had all members of our population been unexposed. As described in Section 1.7, this contrast could be a difference between the averages of the outcome in the two groups, a risk difference, a risk ratio, or any other quantity that compares two groups. The key is that the same population is considered under both exposure and non-exposure: when this is the case, there will be no confounding of the exposure-outcome relationship. Due to the fundamental problem of causal inference - that each subject in our sample is either exposed or unexposed, but will never be observed under both exposures - we must estimate the average causal effect using data where some subjects are exposed and others unexposed. The key to making causal statements about the effect of an exposure on an outcome is knowing when it is possible to compare exposed and unexposed subjects without bias, and how to do so.

Before we get into the details, a note on terminology. In Modules 1 and 2, and in the Hernán and Robins text, the presentation is in terms of “treatments” rather than “exposures”. Here we use the term “exposure”, but all that follows holds if the word “treatment” is instead substituted.

Let Y_i denote the outcome for subject i = 1, . . . , n (i.e. we have data from n individuals), and A_i denote the exposure for subject i, with A_i = 0 indicating that the subject is unexposed and A_i = 1 indicating that the subject is exposed. As in Modules 1 and 2, we consider potential outcomes under both exposures.
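The contrasts described above can be written compactly in potential-outcome notation. The superscript notation below, with the potential outcome of subject i under exposure level a written as Y_i^{a}, follows the Hernán and Robins text and is an assumption here, since the notes break off before introducing their own symbols.

```latex
\begin{align*}
\text{ACE (difference scale)} &= E\left[Y^{a=1}\right] - E\left[Y^{a=0}\right], \\
\text{ACE (ratio scale)}      &= \frac{E\left[Y^{a=1}\right]}{E\left[Y^{a=0}\right]}, \\
\text{observed outcome:}\quad Y_i &= A_i \, Y_i^{a=1} + (1 - A_i) \, Y_i^{a=0}.
\end{align*}
```

For a binary outcome such as asthma remission, E[Y^{a}] is the risk Pr(Y^{a} = 1), so the first two lines are the causal risk difference and the causal risk ratio. The last line expresses the fundamental problem of causal inference: only one of the two potential outcomes is observed for each subject.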
Answered Same Day, Sep 07, 2021


Bezawada Arun answered on Sep 12 2021
## Using R version 4.0.2
## For reference on the analysis see https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3144483/#s11title
## Install packages (NB: packages only need to be installed once; they are loaded with library() in every session)
#install.packages("twang")
install.packages("tableone")
install.packages("Matching")
install.packages("survey")
install.packages("reshape2")
install.packages("ggplot2")
## Load packages
library(twang)
library(tableone)
library(Matching)
library(survey)
library(reshape2)
library(ggplot2)
## Load the dataset for this analysis: either use data(lindner),
## or read it from a CSV file as below
getwd()
setwd("/Users/HP/Downloads") # sets the working directory
lindner <- read.csv("twang.csv", header = TRUE) # load data; the first row is the header
head(lindner)
any(is.na(lindner)) # checks for missing values
str(lindner) # inspect the structure of the data frame
lindner$sixMonthSurvive <- as.numeric(lindner$sixMonthSurvive) # converts FALSE or TRUE to 0 or 1 respectively
summary(lindner) # basic summary statistics
#tail(lindner)
##Comparing the balance of each confounder across abciximab exposure levels using standardised differences
## Covariates
covrs <- c("height","stent","female","diabetic","acutemi","ejecfrac","ves1proc")
## Construct unmatched table
tabUnmatched <- CreateTableOne(vars = covrs, strata = "abcix", data = lindner, test = FALSE)
## Show table with SMD
## NB both SD and SMD can be found in the table.
print(tabUnmatched, smd = TRUE)
# It was observed that 6 out of the 7 covariates have standardised mean differences greater than 0.1,
# which indicates an important covariate imbalance.
addmargins(table(ExtractSmd(tabUnmatched) > 0.1))
## Now we estimate the propensity scores
## Fit a logistic regression model for exposure (abcix) given the covariates
psModel.lindner <- glm(abcix ~ height + stent + female + diabetic + acutemi + ejecfrac + ves1proc,
                       family = binomial(link = "logit"), data = lindner)
summary(psModel.lindner)
#plot(psModel.lindner)
## Predicted probability...
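The script above is cut off at the predicted probabilities. As a rough guide to the steps the module objectives describe next (propensity scores, inverse probability weights, balance in the pseudo-population, and a weighted effect estimate), here is a minimal sketch in base R. It runs on simulated data, not the lindner data, and every name in it (`x`, `a`, `y`, `smd`, and so on) is hypothetical. It should not be read as the continuation of the original answer.

```r
# Sketch on simulated data; all variable names and coefficients are made up.
set.seed(2)
n <- 20000
x <- rnorm(n)                               # one measured confounder
a <- rbinom(n, 1, plogis(-0.5 + 0.8 * x))   # exposure depends on x
y <- 1 + 2 * a + 1.5 * x + rnorm(n)         # simulated average causal effect = 2
dat <- data.frame(x, a, y)

# 1. Propensity score: estimated probability of being exposed given x.
ps_model <- glm(a ~ x, family = binomial(link = "logit"), data = dat)
dat$ps   <- predict(ps_model, type = "response")

# 2. Inverse probability weights: 1/ps if exposed, 1/(1 - ps) if unexposed.
dat$w <- ifelse(dat$a == 1, 1 / dat$ps, 1 / (1 - dat$ps))

# 3. Standardised difference in x between exposure groups
#    (unweighted by default, weighted when w is supplied).
smd <- function(x, a, w = rep(1, length(x))) {
  m1 <- weighted.mean(x[a == 1], w[a == 1])
  m0 <- weighted.mean(x[a == 0], w[a == 0])
  v1 <- weighted.mean((x[a == 1] - m1)^2, w[a == 1])
  v0 <- weighted.mean((x[a == 0] - m0)^2, w[a == 0])
  (m1 - m0) / sqrt((v1 + v0) / 2)
}
smd_before <- smd(dat$x, dat$a)
smd_after  <- smd(dat$x, dat$a, dat$w)

# 4. Difference in (weighted) mean outcomes between exposure groups.
ace_naive <- mean(dat$y[dat$a == 1]) - mean(dat$y[dat$a == 0])
ace_ipw   <- weighted.mean(dat$y[dat$a == 1], dat$w[dat$a == 1]) -
             weighted.mean(dat$y[dat$a == 0], dat$w[dat$a == 0])

print(round(c(smd_before = smd_before, smd_after = smd_after,
              ace_naive = ace_naive, ace_ipw = ace_ipw), 3))
```

In the weighted pseudo-population the standardised difference in `x` shrinks towards zero and the weighted estimate moves from the confounded naive value towards the simulated effect of 2. For a real analysis, a package such as survey (already loaded in the script above) would normally be used to obtain weighted estimates with appropriate standard errors; the hand-rolled `smd` function here is only meant to show the arithmetic.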