The Scale-Up Effect: A Chapter-by-Chapter Round-Up

This post was written to accompany Scaling Programs Effectively: Two New Books on Potential Pitfalls and the Tools to Avoid Them, by David Evans

The book The Scale-Up Effect in Early Childhood and Public Policy: Why Interventions Lose Impact at Scale and What We Can Do About It has 25 contributions (22 chapters plus a case study and two commentaries). Here’s a quick take on each:

Chapter 1: “Social scientists have delivered evidence of countless interventions that positively impact people’s lives. And yet, most programs, when expanded, have not delivered the dramatic societal impacts promised.” (Gupta et al.; open-access version)

Chapter 2: Brain development is extremely important in early childhood. We’re trying unconditional cash transfers for mothers of young children in the US! (Noble) The first study from the project described in this chapter is out now (Troller-Renfree et al. 2022).

Chapter 3: Three cognitive biases can “lead to scaling up programs that should not have been scaled up in the first place”: confirmation bias (government agencies or other organizations may selectively choose research that agrees with their existing beliefs), status quo bias (a policymaker may choose to scale a pilot program that isn’t so successful but where there is operational momentum or political pressure to scale as opposed to trying something new), and bandwagon bias (“widespread adoption of programs in education with weak or no basis in evidence other than the popularity of the program”) (Mayer, Shah, and Kalil).

Chapter 4: We often think of scale-up as a government decision, but most ECD interventions require parental involvement. There are tested ways of boosting take-up, from setting defaults for participation (i.e., opt out rather than opt in), to framing participation in ways that let families feel like they’re not being singled out or shamed, to helping parents update their (often incorrect) beliefs about how much time they invest in their children’s development (Gennetian; open-access version).

Chapter 5: Look, it’s possible to scale the famous Jamaica program with strong results! Here’s evidence from China: “If it can be done there, it can be done elsewhere, and the effects on children around the world could be profound” (Zhou et al.). Although—and this isn’t in the chapter—a different implementation (perhaps less faithful to the Jamaica model) of home visits had much smaller effects in China (Luo et al.).

Case Study: A child home visiting program in Jamaica had big impacts later in life on both economic and other life outcomes. But when it was scaled up in Colombia and Peru, the effects were smaller. What’s up with that? (Sablich).

Chapter 6: There are four principal reasons we see reduced effect sizes at scale-up: false positives in pilots, changes to the program at scale-up, changes in the population the program reaches at scale-up, or changes in the effects themselves when lots of people are getting the program (Al-Ubaydli et al.).

Chapter 7: Programs with a high likelihood of success are those with a high “post-study probability” (i.e., results that have been demonstrated on large samples, with theoretical support, preregistration of the study, and “successful exact replication”). Good descriptions of studies can help: “Without knowing what really was done, it is unlikely to be able to scale the intervention successfully” (Ioannidis et al.).

Chapter 8: Design pilot evaluations to mimic the real-world conditions of the program, moving them along the spectrum from exploratory to pragmatic. Lots of good ideas, including how to recruit the workers who will implement the program: “document the order in which the program would like to hire workers and then randomly select workers from the larger set that would need to be hired if operating at larger scale” (Davis et al.).

Chapter 9: When you scale up a program, people often don’t stick to the program as designed (i.e., lower fidelity). There’s even a “tendency for trained practitioners to show declining fidelity over time, resulting in poorer intervention outcomes.” Supervision and training can help (Caron et al.).

Chapter 10: “Since exposure to treated individuals typically increases with scale, programs that are successful in small doses may appear less successful when scaled up because positive spillovers reduce the differences between the treatment and control groups” (Momeni and Tannenbaum).

Commentary: Partnerships are crucial for effective scale-up. “Concerted front-end efforts to build awareness among agency staff about the program can develop a base of knowledge needed to secure support. These efforts can grow challenging over time, as organizations experience leadership and administration transitions” (Pappas).

Chapter 11: All the factors you saw in chapter 6 play a role in the drop in effect sizes going from an effective program with about 70 children in Jamaica to a program with 700 children in Colombia to a program with 70,000 children in Peru. Political pressures can exacerbate challenges: in Peru, “political pressures led to very rapid expansion targets (often at the cost of quality) or to suboptimal resource-allocation decisions (for example, prioritizing the distribution of staff uniforms over that of manuals and toys)” (Araujo et al.; open-access version).

Chapter 12: The Yale Research Initiative on Innovation and Scale is trying to figure out what works at scale. One spillover mentioned here that doesn’t appear earlier in the book: “does the scale-up of a popular intervention reduce political accountability and government performance by allowing low-quality incumbents to take credit?” (Mobarak and Davis).

Chapter 13: “Resist the temptation to default to technology solutions... Even though technology is often considered a pathway to flexibility, it can add considerable bottlenecks for design, massive costs for development, and—for some populations (e.g., elderly populations; global populations that have cellular phones, but not advanced smartphones)—more limited reach.” Also, “although studies indicate that university researchers are a relatively non-influential group in determining policy makers’ behavior…, creating concise and accessible research products (e.g., digestible briefs) tailored to policy makers’ needs reflects one useful strategy for bridging this gap” (Lyon).

Chapter 14: Make sure to select pilot participants in such a way as to reflect the population that would benefit if the program were scaled (Stuart).

Chapter 15: “Evidence-based medicine (and the more general ‘evidence-based practice’) … is the combination of research evidence, practitioner experience, and patient preferences. The majority of people considering the implementation of evidence-based interventions, however, prioritize research evidence above all other factors. This, along with the additional assumptions described below, significantly impacts—and often hinders—the ability to effectively implement, scale up, and sustain any intervention.” You’ve heard of scale-up, but what about scale down? “many health interventions being delivered are ineffective and even harmful (Cassel & Guest, 2012) and that there is an increasing need to understand how to de-implement those practices, or replace them with newer, more effective interventions” (Chambers and Norton).

Chapter 16: Use measures of impact that you can scale up: i.e., don’t measure child development one way in the pilot and another way at scale-up and expect the same results (McConnell and Goldstein).

Commentary: “An even more important finding from this work includes the difficulty and extreme effort involved in collecting cost data” (Barofsky et al.).

Chapter 17: There’s often a focus on the delivery system. But don’t forget background work that makes sure the delivery system has the funding and other resources it needs. “There may be a tendency to underestimate the complexity involved in collaboration to implement evidence-based interventions when working with multiple stakeholders … and across multiple levels” (Brodowski and Naoom).

Chapter 18: If you want to maintain the fidelity of the program, you’re going to have to invest in continual development of your staff (Pacchiano et al.).

Chapter 19: Four lessons for “forging partnerships for scale”: (1) Build long-term partnerships. They foster trust and make it easier to identify policy windows to respond to quickly. (2) Learn from all kinds of data in the scale-up: rigorous impact evaluation but also descriptive data and reports from implementers. (3) Support a culture of evidence-based policymaking, in part by supporting “champions” of evidence within government, sometimes with external financing. (4) Use evidence-to-policy organizations to support government and also keep evidence prominent in dialogues (Carter et al.; open-access blog post).

Chapter 20: All that stuff above on having multiple replications of high quality is great, but “in the policy sphere … rigorous replication studies are not common and are often not feasible given the broad scale on which policies are generally implemented. Instead, policy makers must rely on information from the available body of evidence about the impacts of a policy on different outcomes, the strength of that evidence, and the degree to which those outcomes align with their own goals in implementing a policy.” Also, research more often tells us “what works” than important design details, like exactly how much of it we need (Osborne).

Chapter 21: How do you go from a program in one state to a nationwide program? More than a decade of building evidence, fostering experts within the government, and putting supportive policies and funding into place. It takes a lot to get to scale (Young and Terra).

Chapter 22: Eight recommendations for researchers, six for policymakers, six for program leaders, and three for funders on supporting effective scale-up (Kane et al.; open-access version).