Worms in Kenya [35 points]

The dataset, ted_miguel_worms.dta, is from Miguel and Kremer's deworming project in Kenya. Please read the paper and familiarize yourself with the experiment before answering these questions. You can find the paper on Quercus.

Miguel and Kremer randomize over schools (not individuals) and introduce deworming drugs to a randomly selected treatment group to estimate the effect of deworming on school attendance.

1. First, why randomize at the school level? Think about what issues might arise in your evaluation if you randomized at the individual level. [5 points]

2. Suppose you had pre- and post-treatment attendance records for all schools. Describe the calculations/comparisons you could do to estimate the effect of deworming on school attendance (in words and with a DD grid). Why might you prefer this DD estimate to a straightforward OLS estimate using only post-treatment data? [5 points]

3. Write down the regression you want to run to estimate the DD you described in question 2. Explain the meaning of the interaction term. [5 points]

Now look at the data set. As before, you are only required to hand in your documented do-files or R scripts along with the answers. You can also include relevant parts of your .log file.

• totpar98 is school participation in 1998, during the first year of treatment (they do not have good pre-treatment data; it is from attendance registers only).
• pill98 is a dummy variable for whether or not the student took a pill.
• treat_sch98 is a dummy variable equal to one if the school received the treatment in 1998.
• infect_early99 is the health outcome in early 1999, when only Group 1 schools had received deworming (Group 2 schools got it right after these data were collected).

Use the describe command to see the meaning of other variables.
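As a rough illustration of the DD grid mentioned in question 2, the sketch below fills a 2x2 table with made-up attendance rates and computes the difference-in-differences alongside the post-only difference. The numbers, the `means` dictionary, and the `diff_in_diff` helper are all hypothetical and are not taken from the paper or the dataset; Python is used here only for illustration, and the same arithmetic carries over to a do-file or R script.

```python
# Hypothetical mean attendance rates by group and period.
# These numbers are invented for illustration; they are NOT from the paper.
means = {
    ("treatment", "pre"): 0.70,
    ("treatment", "post"): 0.80,
    ("control", "pre"): 0.72,
    ("control", "post"): 0.75,
}


def diff_in_diff(m):
    """Return (change in treatment group, change in control group, DD)."""
    d_treat = m[("treatment", "post")] - m[("treatment", "pre")]
    d_ctrl = m[("control", "post")] - m[("control", "pre")]
    return d_treat, d_ctrl, d_treat - d_ctrl


d_treat, d_ctrl, dd = diff_in_diff(means)

# Comparison that uses only post-treatment data.
post_only = means[("treatment", "post")] - means[("control", "post")]

# Print the DD grid: one row per group, plus the difference row.
print(f"{'':12}{'pre':>8}{'post':>8}{'change':>8}")
for group, change in (("treatment", d_treat), ("control", d_ctrl)):
    pre, post = means[(group, "pre")], means[(group, "post")]
    print(f"{group:12}{pre:>8.2f}{post:>8.2f}{change:>8.2f}")
print(f"{'difference':12}{'':>8}{post_only:>8.2f}{dd:>8.2f}")
```

In this invented example the two groups start at different levels, so the post-only difference and the DD do not coincide; the DD nets out the pre-existing gap under the usual parallel-trends assumption.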
The authors have pre-treatment health data, but only for the Group 1 kids (it was considered unethical to collect health information for control group kids), so these data cannot be used in regressions. We therefore cannot run the DD proposed above.

4. How many observations are there per pupil? What percentage of the pupils are boys? What percentage of pupils took the deworming pill in 1998, and what percentage took it in 1999? What percentage of schools were assigned deworming in 1998? Is this more or less than the percentage of pupils who actually took the pill in 1998? [5 points]

5. What does the mean of a dummy variable tell us? Give one example from the dataset. What are two advantages of using dummy variables? [3 points]

6. Using the data, find the difference in outcomes (Y: school participation) between:

(a) Students who took the pill and students who did not in 1998. This is E[Y | X = 1] − E[Y | X = 0]. Is this a good estimate of the effect of taking the pill on school attendance? [3 points]

(b) Students in treatment schools versus students not in treatment schools in 1998 (regardless of whether they actually took the pill). This is E[Y | Z = 1] − E[Y | Z = 0]. [3 points]

(c) Using the data, calculate the difference between the probability of taking the pill given that a student was in a treatment school and the probability of taking it given that a student was not in a treatment school. This is E[X | Z = 1] − E[X | Z = 0]. [3 points]

(d) Derive the Wald estimator and explain the meaning of this result. What is the advantage of the Wald estimator over what you calculated in 6(a)? [3 points]

4 Estimating Spillover Effects [20 points]

For the next set of questions, consider the following scenario and the described potential outcomes framework.

• Assume that a researcher conducts a two-stage randomization of an extra-curricular program designed to decrease bullying in schools. As the program is extra-curricular, only a subset of students in each school are treated and participate in the program. However, the researcher is interested in estimating the spillover effects on students who did not receive the program directly but have peers who received the treatment. To estimate this program's direct impact and spillover effects, the researcher used a two-phase experimental design. In the first phase, schools are randomized to the treatment or a control group. In the second phase, students in the treated schools are randomized to participate in the extra-curricular program or not.
• Define Y_i(d_i, s_i) as the potential outcome for individual i under individual treatment status d_i and school treatment status s_i. For instance, Y_i(d_i = 1, s_i = 1) is the potential outcome for individual i of receiving the treatment and being enrolled in a treated school. Y_i(d_i = 0, s_i = 1), on the other hand, is the potential outcome for individual i of not receiving the treatment but being enrolled in a treated school.
• As usual, the researcher only observes one of four potential outcomes for each individual, but can use the design to estimate the ATT and spillover effects.

For the next set of questions, please select the correct option and provide an explanation.

1. Based on the experimental design, what is the outcome that the researcher never observes for any individual? [4 points]
(a) Receiving the treatment in treated schools, Y_i(d_i = 1, s_i = 1).
(b) Receiving the treatment in control schools, Y_i(d_i = 1, s_i = 0).
(c) Not receiving the treatment in treated schools, Y_i(d_i = 0, s_i = 1).
(d) Not receiving the treatment in control schools, Y_i(d_i = 0, s_i = 0).

2. Based on this framework, how would the researcher estimate the direct average treatment effect (the comparison of those treated with pure control)? [4 points]
(a) E[Y_i(d_i = 1, s_i = 1) | d_i = 1, s_i = 1] − E[Y_i(d_i = 0, s_i = 0) | d_i = 0, s_i = 0]
(b) E[Y_i(d_i = 0, s_i = 1) | d_i = 0, s_i = 1] − E[Y_i(d_i = 0, s_i = 0) | d_i = 0, s_i = 0]
(c) E[Y_i(d_i = 0, s_i = 1) | d_i = 0, s_i = 1] − E[Y_i(d_i = 1, s_i = 0) | d_i = 1, s_i = 0]
(d) E[Y_i(d_i = 1, s_i = 1) | d_i = 1, s_i = 1] − E[Y_i(d_i = 0, s_i = 1) | d_i = 0, s_i = 1]

3. Based on this framework, how would the researcher estimate the spillover effect of the treatment (the impact on those not directly treated but whose peers receive the treatment)? [4 points]
(a) E[Y_i(d_i = 1, s_i = 1) | d_i = 1, s_i = 1] − E[Y_i(d_i = 0, s_i = 0) | d_i = 0, s_i = 0]
(b) E[Y_i(d_i = 0, s_i = 1) | d_i = 0, s_i = 1] − E[Y_i(d_i = 0, s_i = 0) | d_i = 0, s_i = 0]
(c) E[Y_i(d_i = 0, s_i = 1) | d_i = 0, s_i = 1] − E[Y_i(d_i = 1, s_i = 1) | d_i = 1, s_i = 1]
(d) E[Y_i(d_i = 1, s_i = 0) | d_i = 1, s_i = 0] − E[Y_i(d_i = 0, s_i = 0) | d_i = 0, s_i = 0]

4. Now suppose that the treatment at the school level is random, but that students can decide whether or not to participate in the program.
(a) Can the researcher estimate the treatment effect of being offered the program at the school level? Why or why not? [4 points]
(b) Can the researcher estimate the causal spillover effect of the program (the impact of having peers who received the program)? Why or why not? [4 points]
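To make the two-stage design more concrete, here is a minimal simulation sketch. It randomizes schools first, then pupils within treated schools, and tabulates the mean observed outcome in each (d, s) cell. The effect sizes, sample sizes, the outcome scale (a "bullying index"), and the helper names `simulate` and `cell_mean` are all assumptions made up for this illustration; none of them come from the assignment.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical effects on a bullying index (assumed values, for illustration).
DIRECT_EFFECT = -2.0     # participating in the program (d = 1, s = 1)
SPILLOVER_EFFECT = -0.5  # not participating, but in a treated school (d = 0, s = 1)


def simulate(n_schools=200, pupils_per_school=50, share_treated=0.5):
    """Two-stage randomization: schools first, then pupils in treated schools."""
    rows = []
    for _school in range(n_schools):
        s = 1 if random.random() < 0.5 else 0  # stage 1: school assignment
        for _pupil in range(pupils_per_school):
            # Stage 2: only pupils in treated schools can be assigned d = 1.
            d = 1 if (s == 1 and random.random() < share_treated) else 0
            y = 10.0 + random.gauss(0, 1)  # baseline outcome plus noise
            if d == 1:
                y += DIRECT_EFFECT
            elif s == 1:
                y += SPILLOVER_EFFECT
            rows.append((d, s, y))
    return rows


def cell_mean(rows, d, s):
    """Mean observed outcome in the (d, s) cell, or None if the cell is empty."""
    ys = [y for (di, si, y) in rows if di == d and si == s]
    return mean(ys) if ys else None


rows = simulate()
cells = {(d, s): cell_mean(rows, d, s) for d in (0, 1) for s in (0, 1)}

for (d, s), m in sorted(cells.items()):
    label = "no observations" if m is None else f"{m:.2f}"
    print(f"d={d}, s={s}: {label}")

# Differences in observed cell means relative to pure control (d = 0, s = 0).
direct = cells[(1, 1)] - cells[(0, 0)]
spillover = cells[(0, 1)] - cells[(0, 0)]
print(f"treated vs pure control:           {direct:.2f}")
print(f"untreated peers vs pure control:   {spillover:.2f}")
```

Because both stages are randomized in this sketch, the differences in cell means land close to the assumed effects. If pupils instead chose whether to participate, as in question 4, the within-school comparison groups would no longer be randomly formed, and the same arithmetic would not carry that interpretation.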
Oct 23, 2021