The use of impact bonds, and results-based financing (RBF) more generally, continues to grow. One hundred and sixty-five impact bonds have been contracted to date, and at least four outcomes funds were launched in the past three years which aim to raise billions of dollars for RBF. However, there is an ongoing debate about whether RBF mechanisms, like impact bonds, fulfill their promises. Before we move forward with investing billions in RBF models, it is our responsibility to ensure that this is the most effective approach to achieving social impact. Outcomes funds could perform a great public service by earmarking a small percentage of funds for research to test whether RBF actually works as intended and is worth the trouble.
Proponents of impact bonds have made many claims about how they can improve the way the development sector operates, for example, by incentivizing greater impact, drawing in new sources of funding, and spurring innovation. While there is suggestive evidence to support some of these claims, such as shifting focus from inputs to outcomes and improved performance management, we still have a huge amount to learn about whether and when these instruments actually do enhance impact. On the flip side, detractors claim that impact bonds are too complex, risk creating perverse incentives, have high transaction costs, and are not attractive to private investors.
We acknowledge both the (potential) benefits of impact bonds as well as the (very real) drawbacks. IDinsight’s experience evaluating the world’s first development impact bond (DIB) in education and Africa’s first development impact bond gives us reason to believe that impact bonds could spur programs to greater impact and provide valuable data on where course corrections are necessary. Despite getting off to a slow start, the implementer of the education DIB, Educate Girls, achieved huge gains in the final year and greatly exceeded the impact bond’s targets.
But these results only tell us that Educate Girls’ program was a success. They don’t tell us whether the impact bond was the reason. With only one data point–the Educate Girls DIB–we could argue in both directions. We could claim that without the impact bond, Educate Girls might have continued on the trajectory they were on in the first two years and ended up at 78 percent of targets. If that’s true, impact bonds are great–the model would have accounted for over half the gains achieved by the end of the program! But...we don’t know if that’s true. One could also argue that Educate Girls would have achieved these results without an impact bond. Perhaps the only critical components were the targets and incentives, and these could be replicated in a typical grant without the added complexities of an impact bond. Or perhaps Educate Girls simply needed time to work out implementation challenges inherent to any program in a new environment, especially one rolled out to 166 schools all at once. If so, the money and time spent on the impact bond would have been a significant waste.
Figure 1. Actual and projected year three learning gains
At the end of Y3, Educate Girls achieved 160% of the learning gains target. Had they continued on the same trend as in Y1 and Y2, they would have reached only 78% of the target.
And of course, there’s the muddy middle. Maybe Educate Girls would have achieved 90 percent of the target, or 100 percent, or 125 percent without the DIB structure. Each of those outcomes has different implications for whether the DIB was ultimately “worth it.” In this case, we can’t answer this question because we don’t know the counterfactual–what if Educate Girls had just gotten a traditional grant instead of structuring it as an impact bond?
The good news is that we don’t have to guess at the answers to these questions. We can and should be investigating them empirically–the same way we carefully evaluate whether the programs within impact bonds are successful in meeting goals. Below, we’ve outlined some possible approaches to evaluating the impact bond model (versus evaluating the programs they fund). These are listed from most to least rigorous, but each has its own tradeoffs. The approach you choose should follow from the opportunities and constraints of the program contexts. Because of their large scale, we believe outcomes funds are uniquely positioned to draw on these strategies to produce a holistic picture of the value of impact bonds.
1) Impact evaluation with multiple service providers
Outcomes funds have the advantage of working with multiple service providers at once. Although payments are usually made to service providers through impact bond contracts, a portion of the pooled funds could pay some service providers upfront through traditional grants, not impact bonds. Outcomes could then be compared for impact bond versus “non-impact bond” service providers. Service providers could be randomly assigned to receive an impact bond or traditional grant, or a comparison group of service providers that receive traditional grants could be constructed through matching methods.
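As a minimal sketch of the random-assignment step described above (provider names are hypothetical, and a real design would also stratify by sector, size, or geography and involve a proper power calculation), the split into funding arms could look like:

```python
import random

def assign_funding_arms(providers, seed=0):
    """Randomly split service providers into an impact bond arm and a
    traditional grant arm. Illustrative only: real studies would stratify
    and size the arms based on a power calculation."""
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    shuffled = providers[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "impact_bond": shuffled[:half],
        "traditional_grant": shuffled[half:],
    }

# Hypothetical provider names -- not drawn from any real outcomes fund.
providers = [f"provider_{i}" for i in range(10)]
arms = assign_funding_arms(providers)
```

After implementation, outcomes for the two arms would be compared to estimate the effect of the impact bond structure itself, holding the underlying program constant.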
2) Impact evaluation with one service provider
A similar study could be designed with just one service provider. In the Educate Girls example, we were already tracking a treatment and pure control group as part of the evaluation. But we could have added a third group: a second treatment group that received Educate Girls’ normal program, but received funding through a traditional grant. Results from those schools would have helped us understand if the impact bond treatment results were just business as usual for Educate Girls, or the result of the impact bond itself. Similar to the first option, the “non-impact bond” treatment group could be randomly selected or matched using quasi-experimental methods, depending on program constraints. The World Bank’s Health Results Innovation Trust Fund and Allegri et al. (2019) have already evaluated results-based contracts versus other types of funding; however, they have not focused specifically on impact bonds.
3) Process evaluation
If you have an idea of how you expect the impact bond structure to add value, you could sketch out a theory of change and conduct a process evaluation to investigate whether things are happening as you expect. Educate Girls reported making several important changes in response to results from the first two years of the DIB evaluation, including targeting absent students with home visits, administering more rapid assessments in schools, and improving internal performance management systems. With additional funding, IDinsight could have investigated this more systematically and rigorously by documenting those types of changes in real time and checking whether they supported our theory for how an impact bond could create positive changes. Ecorys is currently conducting a similar study across a number of DIBs to assess the extent to which hypothesized benefits of impact bonds are realized in practice.
4) Benchmark to similar interventions
You could also benchmark impact bond performance against other providers operating in similar circumstances. IDinsight did this informally on the Educate Girls DIB by comparing the results achieved to cross-program comparisons put together by J-PAL. Future projects could take a more structured approach, qualitatively comparing specific programs that received different types of funding.
We want to be clear: Evaluating these questions takes resources. But considering that this evidence could inform whether billions of dollars should be flowing into DIBs and other forms of RBF, it could be a very valuable investment.
In conclusion, impact bonds are promising, and it’s exciting that there’s suggestive evidence that they’re delivering on at least some of the big claims that have been made about them. But skepticism at this stage is healthy too. We don’t think anyone should take it on faith that impact bonds are always–or even usually–an improvement on typical grants. In keeping with IDinsight and CGD’s focus on evidence, we think this is an important question for these newly launched outcomes funds to carefully evaluate.
Of course, there is more than one “what if” to consider – Educate Girls could have gotten funding through a simpler form of RBF, such as a performance-based contract. But for simplicity’s sake, we will just be discussing impact bonds vs. traditional grants in this article.
 By traditional grant, we mean funding not tied to results. You could construct different counterfactuals depending on the specific research questions. For example, if you want to test whether a DIB mechanism is necessary to improve performance management, you could create a treatment arm that uses a traditional grant plus performance management training.
Spillovers are a risk with this design. In the example provided, Educate Girls could theoretically have diverted resources from the non-DIB schools to DIB schools, or improvements in performance management that resulted from the DIB could have been applied to the non-DIB program. To guard against this, we could have blinded Educate Girls to which schools were in the non-DIB arm (they were already working in many schools outside of the DIB) and accompanied the impact evaluation with a process evaluation to document any spillovers.
 Students in Educate Girls’ program gained an additional 0.31 standard deviations in test scores over the course of the three-year evaluation. According to the evidence review conducted by J-PAL, an increase in test scores of greater than 0.3 standard deviations is considered a large effect.
 For example, we estimate that adding an extra arm to the Educate Girls DIB evaluation would have increased total evaluation costs by 35-50%.