Proponents of the use of randomized controlled trials (RCTs) in impact evaluation and development research often point out the close link between these trials and their clinical counterparts in the world of medical research. Yet, clinical trials often differ from development RCTs in a number of ways, ranging from their ability to ensure subjects actually take their medicine to their emphasis on blind or double-blind protocol, where subjects are unaware whether or not they have received the real treatment. By contrast, development RCTs exist in a far messier world in which, for example, farmers cannot be forced to use the fertilizer you just randomly allocated them and preventing your study subjects from knowing they have received a bag of fertilizer is nigh impossible.
This has not stopped a group of researchers from trying to close the gap by implementing a double-blind protocol as part of a standard development impact evaluation. In the abstract of the resulting paper, Erwin Bulte and coauthors describe how introducing a blinding protocol seemed to eliminate the effectiveness of the intervention they were studying:
"Randomized controlled trials (RCTs) in the social sciences are typically not double-blind, so participants know they are “treated” and will adjust their behavior accordingly. Such effort responses complicate the assessment of impact. To gauge the potential magnitude of effort responses we implement a conventional RCT and double-blind trial in rural Tanzania, and randomly allocate modern and traditional cowpea seed varieties to a sample of farmers. Effort responses can be quantitatively important—for our case they explain the entire “treatment effect on the treated” as measured in a conventional economic RCT. Specifically, harvests are the same for people who know they received the modern seeds and for people who did not know what type of seeds they got; however, people who knew they had received the traditional seeds did much worse. Importantly, we also find that most of the behavioral response is unobserved by the analyst, or at least not readily captured using coarse, standard controls."
So it appears that the impact of the treatment observed in the non-blinded RCT was being driven solely by the changing behavior of the farmers who knowingly received modern seed varieties. A common interpretation of this result is that RCTs which do not use blinding are somehow biased or are inflating impacts. The journalist Francisco Toro, for example, recently ran across the paper and touted it as a massive blow to the ‘randomista’ movement:
"This gap between the results of the open and the double-blind RCTs raises deeply troubling questions for the whole field. If, as Bulte et al. surmise, virtually the entire performance boost arises from knowing you’re participating in a trial, believing you may be using a better input, and working harder as a result, then all kinds of RCT results we’ve taken as valid come to look very shaky indeed."
"Still, the study is an instant landmark: a gauntlet thrown down in front of the large and growing RCT-Industrial Complex. At the very least, it casts serious doubt on the automatic presumption of internal validity that has long attached to open RCTs. And without that presumption, what’s left, really?"
This left me wondering: should we actually be blinding more often? Part of my work involves running a randomized trial of land titling in Dar es Salaam. Should I be worried that landowners there know whether or not they received a title? Upon further reflection I realized that, even if we took the results of this paper at face value (and there are some good reasons we shouldn’t), it’s hard to see why these results should be so troubling.
Economics Ain't Medicine
The reason that medical researchers use double-blind protocol in clinical trials is to help pin down the exact physiological impact of a medicine, independent of any conscious or subconscious behavioral response by the study group. Placebo effects have been fairly well established, so figuring out that a given medicine has an effect above and beyond the health effects created by taking a sugar pill is important.
Most researchers running development RCTs are answering substantially different questions than medical scientists. It is fairly easy to establish the efficacy of a set of agricultural inputs in a controlled setting: we know fertilizer ‘works’ in that it improves yields. We know vaccines work in saving lives and that increasing educational inputs, to some extent, can improve educational outcomes. This was Jeffrey Sachs’s reasoning when he sold much of the world on the Millennium Villages Project: we have already scientifically proven what works, we just need to implement it.
But most of us running RCTs are not interested in the direct impact of an intervention, holding behavior constant, because it is precisely this behavior that matters the most. If our question is “do improved seeds work in a controlled setting?” then a double-blind RCT is well and fine, but if our question is, “do improved seeds work when you distribute them openly, as you would do in pretty much any standard intervention,” then you need transparent protocols to get at the average treatment effect you are interested in.
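To make the distinction concrete, here is a minimal simulation sketch. It is a stylized model of my own, not the design or data from Bulte et al.: the function name `simulate_trial`, the effect sizes, and the assumption that only farmers who know they hold improved seed put in extra effort are all hypothetical. Under those assumptions, the open trial recovers the direct effect plus the behavioral response, while the blinded trial recovers only the direct effect.

```python
import random
from statistics import mean


def simulate_trial(n, direct_effect, effort_effect, blinded, seed=0):
    """Return the difference in mean yields (treated minus control).

    Stylized assumptions, chosen purely for illustration:
    - each farmer is assigned improved seed with probability 0.5;
    - improved seed adds `direct_effect` to yield regardless of behavior;
    - farmers who KNOW they hold improved seed add `effort_effect` on top;
    - in a blinded trial nobody knows their arm, so effort is equal in both.
    """
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        gets_seed = rng.random() < 0.5
        baseline = rng.gauss(100.0, 10.0)  # hypothetical yield without any seed effect
        extra_effort = effort_effect if (gets_seed and not blinded) else 0.0
        yield_kg = baseline + (direct_effect if gets_seed else 0.0) + extra_effort
        (treated if gets_seed else control).append(yield_kg)
    return mean(treated) - mean(control)


# Hypothetical numbers: a small direct effect and a large behavioral response.
open_estimate = simulate_trial(20000, direct_effect=2.0, effort_effect=8.0, blinded=False)
blind_estimate = simulate_trial(20000, direct_effect=2.0, effort_effect=8.0, blinded=True)

print(f"open trial estimate:    {open_estimate:.2f}")
print(f"blinded trial estimate: {blind_estimate:.2f}")
```

Neither number is wrong. They answer different questions: the blinded estimate speaks to what the seed does with behavior held fixed, and the open estimate speaks to what happens when you hand out seed the way a real program would.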
In addition, many research economists are interested in mechanisms – in picking apart the behavioral responses to a given treatment. In this respect, the Bulte et al. paper is very interesting: here we have an intervention which works primarily through behavioral response rather than a change in, say, household resources. This is intriguing and worth picking apart to get a better sense of why interventions like these work. However, from the perspective of a policy wonk, we might care less about the whys: if you give people improved seeds then yields go up. If you de-worm children then schooling goes up. These are answers worth knowing even if that’s all we know.
Blind RCTs Might Come Up Short on Ethics AND Effectiveness
Those of us interested in behavioral responses don’t necessarily need to dash around running double-blind RCTs to get a handle on them. Consider this excellent paper by Jishnu Das and co-authors on the effect of anticipated versus unanticipated school grants: when parents knew that their child’s school would be receiving more money, they reduced their own spending on school inputs enough to completely offset the gains from the grants. In a world in which we could have run the grant program as a blinded RCT, it would have appeared that grants were successful in raising test scores – but we would have learned precious little about how grants operate in the real world.
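The offsetting logic is simple arithmetic, sketched below. The function `total_school_inputs`, the `offset_rate` parameter, and all the amounts are hypothetical illustrations rather than figures from the Das et al. paper; an `offset_rate` of 1.0 stands in for the full crowd-out described above, and 0.0 for a grant that households did not see coming.

```python
def total_school_inputs(grant, household_baseline, offset_rate):
    """Total inputs per child once households react to a grant.

    `offset_rate` is the (assumed) amount by which households cut their own
    spending for each unit of grant they anticipate. Spending cannot go
    below zero.
    """
    household_spending = max(0.0, household_baseline - offset_rate * grant)
    return grant + household_spending


# Hypothetical amounts in arbitrary currency units.
no_grant = total_school_inputs(0.0, household_baseline=50.0, offset_rate=1.0)
anticipated = total_school_inputs(30.0, household_baseline=50.0, offset_rate=1.0)
unanticipated = total_school_inputs(30.0, household_baseline=50.0, offset_rate=0.0)

print(no_grant, anticipated, unanticipated)
```

With full offsetting, the anticipated grant leaves total inputs exactly where they started, so any test-score gain has to come from somewhere other than extra resources. Only the unanticipated grant actually raises the inputs reaching the child.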
There is another issue here: imposing blinding in many development RCTs creates some substantial ethical issues. Imagine, for instance, that you could keep a Kenyan farmer from knowing whether she received high-quality fertilizer or a bag of dirt. The average farmer might behave as if she had received nothing, she might behave as if she had received a perfectly good bag of fertilizer, or she might hedge and use some of it, realizing that it may not be useful. Some of these decisions may be sub-optimal: if the farmer knew she was in the control group, she might have opted for a different planting method, one which could have resulted in a higher yield. In this particular example, obscuring the treatment from our study group actually runs the risk of doing them harm, especially if they believe they are treated and take complementary actions which are in fact wasteful if they are actually in the control group.
What we should be taking away from the Bulte et al. study is not “all RCTs are biased because we aren’t measuring placebo effects” but instead “behavioral response matters for evaluating real-world policies.” The latter statement actually reinforces the need to have transparent RCTs, rather than to try to mimic the double-blind nature of clinical trials.