The most essential feature of a social impact bond (SIB) is measuring impact. But what happens if the impact metric is questioned or unclear? A recent dispute over measuring the impact of a SIB for early childhood development in Utah yields two important practical lessons for this innovative financing tool. First, SIB implementers should be careful not to exaggerate the precision of their success indicators. Second, they need to be clear to everyone about which objectives they are pursuing.

This particular story unfolded when investors declared the success of a SIB aimed at preparing 3- and 4-year-olds in Utah for kindergarten. Multiple press releases and articles reported the promising results of the preschool program, financed by Goldman Sachs and the Pritzker Family Foundation, with United Way of Salt Lake serving as the intermediary. At CGD, we even featured the program in our Cash on Delivery Update. Within weeks, however, critics were questioning the metrics used to declare success.

At that point, we considered putting out a correction to the newsletter. But after digging deeper, we found that the story is more complex, especially when it comes to choosing the right indicator and assessing the program’s objectives, a view shared by Kenneth Dodge in a New York Times essay.

Is the success metric good enough?

When the program started, 110 children were identified as “at risk,” or likely to need state-funded special education services. After attending the preschool program, however, only one student ended up using these services, presumably saving taxpayers a lot of money.

The program used the Peabody Picture Vocabulary Test (PPVT) to identify preschool-age children who might need special education services in later years. Experts predicted that about one-third of those scoring below 70 on the test would probably need such services during their school years, costing the school district $33,185 per child.

But critics charged that this method overstates the program’s success and the payoff to investors. Some of these 110 children might have improved during the year even without the preschool program. Without a counterfactual, the program cannot know whether all 110 students who scored below the PPVT threshold of 70 would have ended up in special education without the preschool program.
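To see why the counterfactual matters, here is a rough sketch in Python using only the figures quoted above: 110 at-risk children, one child using services, the experts' one-in-three prediction, and $33,185 per child. The function name and the two scenarios are illustrative assumptions. This is not the formula used in the Utah contract, which the figures here do not specify.

```python
# Illustrative sketch only: not the Utah contract's payment formula.
# Figures come from the article; the function and scenarios are hypothetical.

AT_RISK_CHILDREN = 110      # children scoring below 70 on the PPVT
USED_SERVICES = 1           # children who ended up using special education
COST_PER_CHILD = 33_185     # quoted cost to the school district per child


def avoided_placements(at_risk, used_services, counterfactual_rate):
    """Placements avoided, given the share of at-risk children assumed
    to need special education had there been no preschool program."""
    expected_without_program = at_risk * counterfactual_rate
    return max(expected_without_program - used_services, 0.0)


scenarios = {
    "all 110 would have needed services": 1.0,
    "one-third would have needed services": 1 / 3,
}

for label, rate in scenarios.items():
    avoided = avoided_placements(AT_RISK_CHILDREN, USED_SERVICES, rate)
    savings = avoided * COST_PER_CHILD
    print(f"{label}: {avoided:.1f} placements avoided, ${savings:,.0f} saved")
```

Under these assumptions, the estimate based on all 110 children is roughly three times the estimate based on the one-in-three prediction, which is the gap the critics pointed to.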

In some “pay for results” programs, a very precise indicator is required. But in others, a less watertight indicator can be justified. Numerous rigorous studies show that good preschool programs improve learning and reduce the need for special education. So as long as pre- and post-tests in the Utah program show progress, it isn’t unreasonable to assume the program is working and generating cost savings.

The question at this stage is not whether the impact bond’s success is being overstated, but whether the measure is good enough to track progress and calculate payments. Without other provisions, the overstatement could lead to unreasonable returns, but in Utah’s case, the return to investors is capped at 5 percent over the municipal borrowing rate. That is the most they can make. On the other hand, if more than half of the children require special education services, the investors lose what they put in.
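The two limits described above can be sketched in a few lines of Python. This is a simplified illustration, not the actual contract terms: it assumes "5 percent over the municipal borrowing rate" means five percentage points, and the function names and the example rates are hypothetical.

```python
# Simplified sketch of the two limits described above.
# Assumption: the cap is the municipal rate plus 5 percentage points.
# Function names and example rates are hypothetical.

def capped_investor_rate(uncapped_rate, municipal_rate, spread_cap=0.05):
    """Return the investor rate after applying the cap."""
    return min(uncapped_rate, municipal_rate + spread_cap)


def investors_lose_principal(children_needing_services, cohort_size):
    """True if more than half of the cohort requires special education."""
    return children_needing_services > cohort_size / 2


# Example with a made-up 3% municipal rate and a 20% uncapped return.
rate = capped_investor_rate(uncapped_rate=0.20, municipal_rate=0.03)
print(f"Capped rate: {rate:.2%}")
print("Principal lost:", investors_lose_principal(1, 110))
```

The point of the sketch is that an overstated success metric can push the uncapped return up, but the cap limits how much of that overstatement reaches investors.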

This is one way SIBs differ from traditional ways of financing public services. Based on the evidence, Utah would be justified in simply expanding preschool programs to all children who score low on the PPVT. The advantage of the SIB, however, is that the upfront funding is provided by investors and that the program’s results are tracked annually, with financial consequences. This generates a strong feedback loop that many social services lack.

Is the program objective appropriate for a SIB?

The program’s objective also affects whether the PPVT is good enough for measuring success. The SIB’s designers noted that the program can be designed with two objectives: cost avoidance associated with reduced demand for special education services and outcome improvements associated with better learning and social integration.

From press accounts, it isn’t entirely clear whether the SIB was primarily designed to help the state save on special education services (cost savings) or to find money for an underfunded but successful education program (outcome improvement). In this case, the independent evaluator, Utah State University, estimated the cost savings from the first cohort of students not using special education services at $281,550. But even if the metric overstated the cost savings, 110 children received a preschool program that the state would not otherwise have offered, and private rather than public money was put at risk to provide it.

Utah’s experience is a good opportunity to recognize how much we know about choosing good indicators for SIBs and other “pay for success” programs. It also shows the importance of clearly explaining to decision makers and the public why an indicator is “good enough,” how much upside and downside risk is associated with measurement error, and what the true objectives of the program are. Once we have more experience with SIBs, we may even develop standard reporting formats. Until then, we (and the media) need to ask good questions and be more reflective before we trumpet or condemn a particular SIB.