Back in 2004, a major new development project started in Bar-Sauri, Kenya. This Millennium Village Project (MVP) seeks to break individual village clusters free from poverty with an intense, combined aid package for agriculture, education, health, and infrastructure. The United Nations and Columbia University began the pilot phase in Bar-Sauri and have extended it to numerous village clusters in nine other countries. They hope to scale up the approach across much of Africa.

But wait: before we consider blanketing a continent with any aid intervention, we have to know whether it works. For example, we have to know whether different things have happened in Bar-Sauri than have happened in nearby Uranga, which was not touched by the project. And we have to know whether those differences will last. This matters because aid money is scarce: the tens of millions slated for the MVP are tens of millions that won’t be spent on other efforts.

Here I discuss a new research paper that I wrote with Gabriel Demombynes of the World Bank. We ask when it’s important to take great care in measuring a project’s impacts, and we illustrate one concrete case: the Millennium Village Project. We show how easy it is to get the wrong idea about the project’s impacts when careful, scientific impact evaluation methods are not used. And we detail how the impact evaluation could be done better, at low cost.

Update: Over at the World Bank’s Africa blog, Gabriel talks more about the paper and about a fascinating field visit he made after we finished the study.

The paper deliberately makes no conclusion about the wisdom or effectiveness of the intervention itself, except for the modest claim that the intervention shouldn’t be massively scaled up until its effects have been reliably estimated. To me that’s uncontroversial; Africans have urgent needs, but they have urgent needs for things that work, and many of them have been disappointed by well-intended outsiders in the past.
It’s all relative

Our first point is about comparing trends inside the Millennium Villages to trends outside them. The June 2010 MVP midterm evaluation shows positive trends within the Millennium Villages on development indicators like access to sanitation, water, and cell phones. The project claims responsibility for those changes by calling them “impacts” of the project. But those trends alone don’t tell you the impact of the project, because some or all of those changes might have happened even if the project hadn’t.

One way to approximate what might have happened without the MVP is to look at trends in the same indicators across the regions around the Millennium Villages, and across the whole country. For example, below are trends in the fraction of small children sleeping under insecticide-treated nets (ITNs), which help prevent malaria. The black line shows the Millennium Village of Bar-Sauri, Kenya. The blue line shows the trend in rural Nyanza Province (Bar-Sauri comprises just 1.3% of the rural population of Nyanza). Green shows rural Kenya, and red shows all of Kenya.
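The logic of comparing trends inside and outside the villages can be made concrete with a simple difference-in-differences calculation: subtract the change that occurred anyway in the surrounding region from the change observed in the village. Here is a minimal sketch with invented ITN-coverage numbers (they are placeholders for illustration, not the actual survey figures):

```python
# Hypothetical ITN coverage rates (fractions of children covered).
# These numbers are invented for illustration, NOT the actual survey data.
village_before, village_after = 0.05, 0.60   # Millennium Village, baseline vs. follow-up
region_before, region_after = 0.05, 0.45     # surrounding rural region, same period

# Naive "impact": the raw change inside the village.
naive_change = village_after - village_before

# Difference-in-differences: net out the change that happened
# in the region even without the project.
did_estimate = naive_change - (region_after - region_before)

print(f"naive change:  {naive_change:.2f}")   # large
print(f"DiD estimate:  {did_estimate:.2f}")   # much smaller
```

With these made-up numbers, most of the village’s improvement would have happened anyway, so the naive within-village trend badly overstates the project’s impact.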
The current MVP evaluation approach has five weaknesses:

- Subjective choice of intervention sites. The initial wave of Millennium Villages was chosen in part because the sites had special traits, such as committed NGO partners and local leaders, that would help the project work well. This is a perfectly reasonable choice from a project-management point of view. But it means it’s unclear whether the intervention would work as well at other sites. The MVP evaluation protocol mentions this issue.
- Subjective choice of comparison sites. In the future the MVP plans to release data on comparison villages that did not get the intervention. But the evaluation protocol doesn’t give you a way to know whether the comparison sites are truly just like the intervention sites in every way that could affect project success—such as (again) the commitment of local leaders. I discussed this problem back in April.
- Lack of baseline data on comparison sites. Unfortunately, though the project now includes comparison villages, no data were gathered on conditions in those villages at the time the project began (at “baseline”). This may be because having comparison villages at all is a recent change in the project, which stated as recently as three years ago that it considered collection of data in comparison villages to be unethical. Now there are comparison villages, but the lack of baseline data means that it will be harder to tell if any differences between the intervention sites and the comparison villages existed before the project started. The MVP evaluation protocol notes this weakness too.
- Small sample size. Just ten village clusters are involved in the project. The MVP evaluation protocol notes that this small sample size will only be able to reliably detect a very large minimum effect of the project on child mortality: a 40% drop in five years. That is a giant effect, more than twice as fast as the decline in child mortality called for by the ambitious Millennium Development Goals. Any smaller effects might be difficult to scientifically distinguish from zero.
- Short time horizon. The current MVP evaluation protocol only contains plans to evaluate the impact of the MVP over a five-year timespan. This is inadequate, because past village-level package interventions in poor rural areas have had short-term effects that rapidly dissipated after ten years. The goal of the project is to create “self-sustaining economic growth” in the villages, and until it’s known whether or not the project is capable of that, the project shouldn’t be greatly scaled up.
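The sample-size point above can be illustrated with a standard minimum-detectable-effect calculation, treating each village cluster as one observation. Every number below is an assumption chosen for the sketch, not a parameter from the MVP protocol’s actual power analysis:

```python
from statistics import NormalDist

# Illustrative assumptions, not the MVP protocol's actual parameters:
alpha = 0.05      # significance level (two-sided test)
power = 0.80      # desired probability of detecting a true effect
n_per_arm = 10    # village clusters per arm, each cluster one unit
sigma = 0.25      # assumed SD of the cluster-level outcome change

z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96
z_power = NormalDist().inv_cdf(power)          # ~0.84

# Minimum detectable effect for a two-sample comparison of cluster means:
# any true effect smaller than this is likely indistinguishable from zero.
mde = (z_alpha + z_power) * sigma * (2 / n_per_arm) ** 0.5
print(f"minimum detectable effect: {mde:.2f}")
```

Because the minimum detectable effect shrinks only with the square root of the number of clusters, halving it requires roughly quadrupling the sample—which is why a handful of clusters can reliably detect only very large effects.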
In our paper, we detail exactly how a proper impact evaluation could be done, at a cost per village not much higher than the cost of the current evaluation, and in a way that remedies all five of the above weaknesses. The proposal is for random assignment of treatment within about 20 matched pairs of village clusters, where both members of each pair are monitored over 15 years.

The paper has two ultimate purposes. One is to highlight the need, in general, for rigorous impact evaluation when it is feasible. The second is to argue for such an evaluation of the MVP going forward. It’s too late for the current crop of MVP sites. But there is no obstacle to undertaking a rigorous evaluation for the next 20 sites. There are plans for far more than that number.
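The matched-pair design can be sketched in a few lines: group similar village clusters into pairs, then flip a coin within each pair to decide which member gets the intervention. The cluster names and pairings here are invented placeholders, not actual MVP sites:

```python
import random

# Hypothetical matched pairs of similar village clusters (invented names).
pairs = [
    ("cluster_A1", "cluster_A2"),
    ("cluster_B1", "cluster_B2"),
    ("cluster_C1", "cluster_C2"),
]

rng = random.Random(42)  # fixed seed so the assignment is reproducible
treated, comparison = [], []
for a, b in pairs:
    if rng.random() < 0.5:   # fair coin flip within each pair
        treated.append(a)
        comparison.append(b)
    else:
        treated.append(b)
        comparison.append(a)

print("treated:   ", treated)
print("comparison:", comparison)
```

Because each pair is matched on observable traits before the coin flip, any systematic difference that later emerges between the treated and comparison members can be attributed to the intervention rather than to site selection.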
DISCLAIMER & PERMISSIONS
CGD's publications reflect the views of the authors, drawing on prior research and experience in their areas of expertise. CGD is a nonpartisan, independent organization and does not take institutional positions. You may use and disseminate CGD's publications under these conditions.