Impact Evaluation is Critical for Building Knowledge

Unfortunately, improperly conducted evaluations are misleading: they purport to reach conclusions that are actually unsubstantiated, so the risk of wasting public resources, or even of harming participants, is real. It is for this very reason that clinical trials of medications have become a standard and integral part of medical care. No one would consider prescribing strong medications without properly evaluating their impact and potential side effects. Yet in social development programs, where huge sums of money can be spent to modify population behaviors, change economic livelihoods, and potentially alter cultures or family structures, no such standard has been adopted. While it is widely recognized that withholding programs known to be beneficial would be unethical, the implicit corollary, namely that programs of unknown impact should not be widely replicated without proper controlled evaluation, is frequently dismissed.

Good studies avoid costly mistakes

Findings from impact evaluations can help avoid costly mistakes. For example, an Indian NGO (Seva Mandir) decided to hire a second teacher for its non-formal education centers in the hope of increasing attendance and attainment. Twenty-one of the NGO's 42 centers were randomly selected to receive a second teacher. Intermediate indicators, such as the number of days the school was in session, did improve; however, test scores remained the same. The NGO concluded that the benefits of the two-teacher initiative did not justify its cost and redirected the funds to expand other, more promising programs (see Duflo 2003).

The value of impact evaluations in avoiding costly mistakes takes on particular urgency for programs that are scaled up to a national level. For example, in the US, a program entitled Drug Abuse Resistance Education (DARE) had been adopted in 75% of US school districts because it was believed to be effective; however, evaluations with random assignment demonstrated that the program was ineffective, meaning that financial resources and school time had been wasted (Lynam et al 1999, Rosenbaum and Hanson 1998). Similarly, a review of 10 randomized controlled studies on the policy of "tracking" students, i.e. grouping them by skill level, showed this approach has little or no effect on student achievement (Mosteller et al 1996). Despite the lack of evidence, skill grouping continues to be the most common basis for organizing classes in US middle and high schools.

Good studies can distinguish successes even under adverse circumstances

For those convinced of the efficacy of their programs, money spent on demonstrating impact through comparisons of participants and non-participants often seems unnecessary. However, without such comparisons, beneficial programs that mitigate negative trends might be mistakenly viewed as failures. For example, numerous programs to prevent the spread of HIV/AIDS are being financed around the world, but the best they can hope for in the short run is to slow the rate at which prevalence is increasing. Therefore, unless the programs can demonstrate that the rate at which the disease has spread in their target group is lower than in other, appropriately controlled groups, they will look like failures.

This ability to distinguish a successful program under adverse circumstances was demonstrated clearly with a US Department of Labor Summer Training and Education program. A random assignment study found that disadvantaged teens lost half a grade in reading ability – apparently a complete failure. However, non-participants lost a full grade of reading ability. The evaluation demonstrated that the program mitigated the loss of reading ability that naturally occurred during the summer vacation months (Grossman 1994).
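The logic of the comparison above can be made concrete with a toy simulation. All numbers and the `summer_change` helper below are illustrative inventions loosely patterned on the Grossman example, not the study's actual data: every student drifts downward over the summer, the program offsets part of that drift, and only the treated-minus-control difference reveals the benefit.

```python
import random

random.seed(0)

def summer_change(effect, n=1000):
    # Each student's reading-level change over the summer, in grade units:
    # everyone drifts down about one grade on average (plus noise), and
    # program participants get `effect` grades added back.
    return [(-1.0 + effect) + random.gauss(0, 0.3) for _ in range(n)]

treated = summer_change(effect=0.5)   # program mitigates half the loss
control = summer_change(effect=0.0)   # counterfactual: no program

mean = lambda xs: sum(xs) / len(xs)

print(f"treated change:   {mean(treated):+.2f} grades")  # negative: looks like failure
print(f"control change:   {mean(control):+.2f} grades")  # even more negative
print(f"estimated impact: {mean(treated) - mean(control):+.2f} grades")
```

A naive before/after reading of the treated group alone (a loss of about half a grade) suggests failure; only the contrast with the randomly assigned control group shows the program's positive impact.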

Good studies identify real successes

Unfortunately, it is all too common to find poor evaluations that claim to show a program's positive impact when, in fact, the positive results are due to something other than the program. For example, retrospective studies in Kenya appeared to demonstrate that providing audiovisual aids improved student test scores. But this finding was apparently driven by unobserved factors, because more rigorous randomized-assignment studies found little or no effect (Glewwe et al 2004). Similarly, a US program aimed at assisting poor families through social service visits showed that those receiving the program experienced improvements in family welfare, but so did the families who were randomly assigned to a control group that did not receive the visits (St. Pierre and Layzer 1999). In both cases, a good study helps avoid spending funds on ineffective programs and redirects attention to more promising alternatives.
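How unobserved factors can manufacture a spurious "effect" can be sketched with a toy simulation. Everything below is invented for illustration and is not modeled on the Kenya data: a hidden `resources` variable drives both which schools adopt the aids and how students score, while the aids themselves do nothing. A retrospective comparison then shows a large difference, whereas a randomized comparison correctly shows none.

```python
import random

random.seed(1)

N = 2000
# Unobserved factor (e.g. school resources or parental support) that drives
# BOTH adoption of audiovisual aids AND test scores.
resources = [random.gauss(0, 1) for _ in range(N)]

def score(res):
    # True program effect is zero: scores depend only on resources plus noise.
    return 50 + 10 * res + random.gauss(0, 5)

# Retrospective (observational) comparison: better-resourced schools adopt.
adopted = [r > 0 for r in resources]
obs_with = [score(r) for r, a in zip(resources, adopted) if a]
obs_without = [score(r) for r, a in zip(resources, adopted) if not a]

# Randomized comparison: a coin flip decides who gets the aids.
assigned = [random.random() < 0.5 for _ in range(N)]
rct_with = [score(r) for r, a in zip(resources, assigned) if a]
rct_without = [score(r) for r, a in zip(resources, assigned) if not a]

mean = lambda xs: sum(xs) / len(xs)
print(f"retrospective 'effect': {mean(obs_with) - mean(obs_without):+.1f} points")
print(f"randomized effect:      {mean(rct_with) - mean(rct_without):+.1f} points")
```

Because random assignment breaks the link between the hidden factor and who receives the program, the randomized contrast isolates the program's true (here, zero) effect.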