Impact Evaluation: How the Wonkiest Subject in the World Got Traction

March 14, 2016

This blog is part of a special series celebrating CGD’s 15th anniversary in 2016. All year, CGD experts will look back at work we’ve done that has had real-world impact, and forward to future research that we hope will help increase global prosperity.

“3ie has made my job much easier.” 

This is what we heard last month from a high-ranking government official in Africa, referring to the International Initiative for Impact Evaluation (3ie), and it made us very proud. The creation of 3ie was the outcome of the Evaluation Gap Working Group, which we led along with Nancy Birdsall to address the limited number of rigorous impact evaluations of public policies in developing countries. As CGD celebrates its 15th year, it is worth considering what made that working group so successful, the obstacles we confronted, and the work that still remains to be done.

In the early 2000s, we decided to tackle a long-standing problem: there were simply too few good-quality impact evaluations being conducted to ascertain whether development projects were achieving their goals and to help developing countries learn which interventions are effective and how to improve them. We wanted to know why governments and agencies underinvested in rigorous studies so that we might develop a practical solution to this problem. With the support of the Gates and Hewlett Foundations, we convened a working group to undertake this task.

When CGD convenes working groups, it tries to get people with different perspectives and backgrounds. The Evaluation Gap Working Group members followed this model: some people were strong supporters of particular evaluation methods, while others were skeptical of the demand for, let alone the usefulness of, such studies. Through research findings, interviews, and consultations in different parts of the world, we gradually developed a consensus report on the need for more and better impact evaluations, the reasons behind the lack of investment in evaluation within the development community, and the potential solutions corresponding to those reasons: strengthening the evaluation practices of major funders and creating an international organization to fund and promote such evaluations. It took another two years to facilitate the process that culminated in the creation of 3ie, and longer still for large bilateral agencies, like the UK Department for International Development and the US Agency for International Development, to create and implement policies that incorporated attention to impact evaluation.

The Working Group process was not without its difficulties. We worried that the topic would fail to garner attention. One of our colleagues told us the issue was simply too “wonky.” Instead, it kicked off a firestorm among bilateral evaluation departments and evaluation associations. The negative reaction suggested to us a kind of “Emperor’s Clothes” story – most agencies and evaluation experts knew that their evaluation studies weren’t up to the task of assessing impact but they were loath to address it.  There are likely many reasons for this, ranging from professional pride to sincere concerns about the ethics, feasibility and utility of what some perceived as an approach too sophisticated or inappropriate for the context of development programs.

In parallel, researchers (primarily economists) were expanding the use of Randomized Controlled Trials (RCTs) to assess project interventions, and our initiative got tied up in long-standing methodological debates over the applicability of RCTs to social analysis. As much as we argued that our initiative was about rigor and not a particular method, it continued to be branded as an RCT crusade – and, admittedly, there were some on the working group who sought to make it that. However, even though the final report argued that RCTs held great promise, it called for convening an international group to develop standards and stated that “[t]he starting point is defining the policy question and the context. From there it becomes possible to choose the best method for collecting and analyzing data and drawing valid inferences.”

By raising the question of why more impact evaluations were not being conducted – why so few development programs even had baseline data – the working group process directly shook up established interests. Evaluation among bilateral agencies was (and still is) focused more on processes, operations, and strategies than impact. The world of professional evaluators has a lot to offer in these kinds of studies but has relatively few experts in constructing the plausible counterfactuals – either statistically or qualitatively – that are needed to assess impact. The World Bank delayed publication of our report by a full year by offering numerous critiques which, once written down, were at best confusing. The reasons for this opposition remain unclear to us, but were probably related to the World Bank’s own interest in getting funds for its impact evaluation efforts like DIME and SIEF. It remains a shame that the World Bank is not providing funding to 3ie and other collective efforts to promote impact evaluation in developing countries. The World Bank is probably the only international organization with the scale and capacity to pursue its own rigorous evaluation program and, perhaps because of that go-it-alone tendency, has not fully engaged in the community of practice around impact evaluation.

In retrospect, it’s clear that the working group process and results gradually won respect and shifted some of the terms of the debate.  The working group articulated a general recognition that more impact evaluation was required and that the rigor of evaluations needed to be addressed. It persuaded some key people that evaluating important questions about the impact of aid at the country or sector level still needed to be informed by evidence on local and specific interventions. It demonstrated that a dozen or so developing countries had an interest in institutionalizing impact evaluation as part of their domestic policy process. And last but not least, it culminated in creating an international organization, 3ie, to channel funds and build a community of practice around more and better impact evaluations.

Impact evaluations have clearly increased in number and quality over the last 10 years. CGD’s Evaluation Gap Working Group and the creation of 3ie were not the sole cause of this, but we think it plausible to say that the initiative contributed to and may even have accelerated this process by bringing a new organization and more funding into the field. In fact, though the controversy and misunderstanding were frustrating, they probably garnered the subject matter more attention than it would have gotten if we hadn’t ruffled feathers.

The Evaluation Gap initiative, one of the early CGD working groups, was in equal parts eclectic and directed, focused on a problem and keen to get to a solution, albeit not a predetermined one.  Many CGD working groups follow this approach: identifying a problem, posing a provocative question that invites different ways of thinking, convening people with different perspectives, and informing the process with good research. When the working group’s final reports and recommendations are issued, CGD follows through by direct engagement with organizations that could carry the ideas forward, as facilitators, and sometimes incubators. 

The evaluation gap is closing even if it isn’t closed. The key finding of our 2006 report was the institutional bias against investing in rigorous studies of impact. 3ie is heavily dependent on a few committed funders when it should be financed by long-term financial commitments from all countries and multilateral agencies engaged in policy and programs. We have argued that a collective commitment to dedicate 0.1% of annual disbursements to 3ie – or some similar international fund for rigorous impact evaluation – would make a big difference to improving the effectiveness of development aid. In fact, we argue that collectively financing the creation of knowledge should actually be the future of aid. We simply cannot continue to underinvest in the evidence base on public policy.

Finally, this is not simply a matter of more studies but also the growth of a community of practice that includes researchers, government officials and institutions. The African official we quoted at the beginning explained why 3ie was so important. “I can get things done in [my country] because of the resources I can draw on from 3ie, and I stay motivated because I can turn to these international colleagues.” That is one key to global development.

Ruth Levine is now Director, Global Development and Population Program, Hewlett Foundation and was formerly a CGD Senior Fellow and Vice President for Programs and Operations.


CGD blog posts reflect the views of the authors, drawing on prior research and experience in their areas of expertise. CGD is a nonpartisan, independent organization and does not take institutional positions.