Can Better Test Scores Lift Kids Out of Poverty? The Case for Long-Term Tracking of Education RCTs

and

April 29, 2025

Existing evidence provides reason to be bullish on the long-term payoff to literacy and numeracy investments, but causal claims are often tenuous. For instance, we have a solid base of longitudinal studies linking primary school test scores to higher earnings in adulthood, but some obvious confounding factors remain: kids who go to better schools likely enjoy myriad other advantages along the way. Looking at the broader social benefits of girls’ schooling, and focusing on studies with a stronger claim to causality, a 2019 systematic review by the Population Council “found no evidence—not a single study—of the effects of literacy or numeracy on sexual and reproductive health” in low- and middle-income countries.

Nobody would deny that all kids deserve to learn to read and do basic math, at a bare minimum. But in a low-income context facing stark budgetary trade-offs, how much should governments and donors be willing to spend to improve test scores by, say, 0.2 standard deviations? $100 per pupil? $200? More? Or would that money be better spent on cash transfers, vaccines, and a host of other priorities?

To answer those questions, in 2023 we launched CGD’s Return to Learning Initiative to invest in long-term follow-ups to studies that can pin down the causal links. Our criteria were:

Randomized trials from a developing country setting…
…of programs that focused narrowly on learning outcomes in primary school, rather than multifaceted programs that invested in nutrition or health,
…which successfully raised test scores
…with participants who are now old enough to be in the labor market and starting families
…and could be successfully tracked today.

After compiling a long list of candidate studies, in 2024 we set out to test the feasibility of that last bullet point – collaborating with research teams in Mali, Liberia, Ghana, and Uganda to track participants from literacy interventions 12-15 years ago.

This post lays out what we’ve learned so far, and what we’re hoping to do next.

The proliferation of education RCTs in developing countries in the 2000s has created a new opportunity to study long-term impacts

By the mid-2010s, more than a dozen randomised control trials (RCTs) had tested interventions aimed at improving literacy and numeracy in low- and middle-income countries (LMICs). Many of the students who participated in these trials are now adults in the workforce. If they can be successfully located today—a big if—they could help answer questions with obvious policy importance, such as: if you learn more in second grade, do you earn more later?

From nearly 40 evaluations, we identified four pilot studies (Table 1). In addition to the criteria above, we looked for studies with big enough sample sizes to help us detect long-term effects, and where the original principal investigators (PIs) and institutional review boards (IRBs) were able to provide access to data and tracking information.

We tested two study designs: longitudinal studies that tracked the same individuals over time, and cross-sectional studies that tested different cohorts at each stage. Since around 40 percent of the evaluations in our database are cross-sectional, devising methods to track outcomes in each design was an important goal.

Table 1. Pilot studies for long-term tracking

#	Citation	Country	Program dates	Schools	Learning effect	Age, 2024	Kids tracked?
1	Lucas, McEwan, Ngware, & Oketch, 2014	Uganda	2009-11	109	0.18 sd	21.0	Yes
2	Piper & Korda, 2011	Liberia	2008-10	116	0.80 sd	24.5	No
3	Duflo, Kiessel, & Lucas, 2024	Ghana	2010-13	300	0.16 sd	21.5	Yes
4	Spratt, King, & Bulat, 2013	Mali	2009-12	75	0.25 sd	23.0	No

Note: reproduced, and updated, from “Will Raising Test Scores in Developing Countries Produce More Health, Wealth, and Happiness Later in Life?”

Our baseline estimate is that a one standard deviation improvement in test scores should raise earnings by roughly 20-30 percent.

We are doing this work because there is little evidence causally linking early skill improvements to adult outcomes in LMICs. But we are not starting from scratch. Studies in high-income countries have already demonstrated this connection, and related research in LMICs provides additional insights.

We reviewed 17 high-quality studies that link cognitive skills to earnings. This research is summarised in Table 2 and shows that literacy is linked to earnings, but estimates vary widely. Interpreting the range of estimates across locations and study designs is not straightforward.

The most comparable interventions (those that improved education quality) suggest that a one standard deviation (SD) increase in skills raises earnings by around 15 percent—but these come from only the US and Norway.
Interventions in the most relevant contexts suggest larger effects, with studies in Ghana and Colombia estimating that each standard deviation increase in skills translates to earnings gains of 40 to 80 percent.
Observational studies from LMICs suggest an average return of about 25 percent, slightly higher than estimates for high-income countries.
The strongest causal evidence (experiments that track individuals from early childhood into adulthood) suggests returns of at least 20 percent, often exceeding 40 percent, though these programs typically involved more than just literacy gains.

Despite variation in findings, the evidence strongly suggests that literacy plays a valuable role in shaping future earnings. A reasonable expectation for LMICs might be a return of 25–30 percent per SD increase in literacy skills.

Table 2. The returns to cognitive skills in existing studies

Type	Study	Location	Intervention	Earnings gain per SD skill increment
A	Danon et al. 2024	Pakistan	None	24 %
A	Evans & Yuan 2017	Multiple LMIC	None	36.5 %
A	Glewwe et al. 2022	China	None	12.9 %
A	Hampf et al. 2017	Multiple OECD	None	26 %
A	Heckman et al. 2015	Multiple OECD	None	18 %
A	Perez-Alvarez 2017	Multiple LMIC	None	28.5 %
A	Perez-Alvarez et al. 2023	Multiple LMIC	None	16-30 %
B	Campbell et al. 2012	US	Abecedarian	46 %
B	Gertler et al. 2021 *	Jamaica	Stimulation & nutrition	> 90%
B	Heckman et al. 2010	US	Perry Preschool	21 %
B	Thompson 2017 *	US	Head Start	42 %
C	Bettinger et al. 2019 *	Colombia	Private school lottery	40 %
C	Duflo et al. 2023	Ghana	Secondary scholarships	81 %
D	Chetty et al. 2011	US	Higher teacher VA	12 %
D	Chetty et al. 2014	US	Higher quality class	9 %
D	Jackson et al. 2018 *	US	Higher spending	14 %
D	Kirkebøen 2022	Norway	Higher teacher VA	20 %

Notes: a more detailed table can be found here, which indicates when the skills measure was taken, the route from study findings to the summary earnings per SD skills figure, and notes. These 17 studies fall into four methodological types. Type A: controlled regression (strengths: many countries, getting better at connecting pre-market skills and later outcomes. weaknesses: no causal identification, omitted variables); Type B: early childhood (strengths: identification, long-term panel. weaknesses: small samples and programs combining many things, inc. with HH; predominantly US); Type C: additional education (strengths: identification, locations, schooling levels. weaknesses: primarily about access increases, which indirectly boost learning); and Type D: improved quality (strengths: direct relevance to marginal quality improvements, identification. weaknesses: all OECD).
* These four studies combine results on learning and results on earnings from different papers; the earnings paper is referenced here and the earlier paper in the linked sheet.

Low attrition rates on longitudinal studies are achievable, with some caveats

Fieldwork involved overcoming a range of logistical and ethical challenges, including recovering original personal data and obtaining approvals for re-contacting participants. Collaboration with original PIs—Moses Oketch and Adrienne Lucas—and institutions was crucial in navigating these steps.

To establish high-quality tracking protocols, we reviewed 44 follow-up studies, consulted with experts, and examined several protocols used in longitudinal surveys. These protocols were tested in 10 pilot sites in Uganda and 12 in Ghana, where we attempted to locate and interview all individuals from the original study.

The results exceeded our expectations, with 86 percent of participants tracked in Ghana and 95 percent in Uganda (Table 3).

Table 3. Results from individual tracking in Ghana and Uganda

Country	Pilot sites	Interview targets	Number tracked	Of which: deceased	Number not tracked	Tracking rate
	(A)	(B)	(C)	(D)	(B)-(C)	(C)/(B)
Ghana	12	284	245	8	39	86%
Uganda	10	636	606	8	30	95%

For comparison, the mean tracking rate across 27 attempts to follow-up participants from experimental studies in low- and middle-income countries was 80 percent (Figure 1). Of the seven studies with an equal or longer follow-up period, only one achieved a higher rate—93 percent after 20 years in Mexico (Araujo & Macours, 2021).

Figure 1. Tracking rates in long-run follow-ups of experiments in low- and middle-income studies (plus our simple mean rate in red)

Notes: This figure shows tracking success rates from 27 follow-ups to experiments in low- and middle-income countries. Studies with “Regular Tracking Only” try to find all individuals from the target sample and do that to exhaustion (of time, budget, or energy). In contrast, studies with a two-phase tracking approach, “Regular & Intensive Tracking” begin with regular tracking before switching to an intensive phase that targets a random subsample of so far unreached individuals and weights their data accordingly. For the “Regular & Intensive” studies, the main marker shows the overall effective tracking rate, combining responses from both phases, and we also indicate the (lower) regular tracking rate achieved, with a small hanging marker.

That said, we also identified clear limits to when and where longitudinal tracking seems feasible.

Many RCTs in our database were cross-sectional, meaning participant names were never recorded. We tested whether we could reconstruct relevant cohorts from school rosters, but missing records and inaccessible schools made this infeasible.

In Liberia, only 16 of the 40 rosters were found (either at schools or from staff who had worked there), leading to high sample attrition. In Mali, security issues and school closures made follow-up impossible in at least one-third of schools. These results suggest that future efforts should prioritise longitudinal studies where individual data is available.

Few education RCTs are powered to study long-term impacts, suggesting the need to pool multiple studies

Finding individuals long after an intervention is encouraging, but it does not automatically mean we can measure the effects we care about. The original studies were never intended to provide evidence on long-term outcomes in isolation. But we’re increasingly confident that the solution lies in combining data from multiple studies.

While we lost two studies (Liberia and Mali) from our pilot, this does not fundamentally weaken our approach. In fact, we can replace them with a stronger study from India, specifically a follow-up of a 2013–14 Pratham learning camp in Uttar Pradesh. This intervention showed large short-term effects (0.7 standard deviations), similar to those measured in Liberia. With a sample covering 245 schools, a follow-up study would be a valuable addition to the initiative.

With a small number of studies, estimating a pooled effect is inherently difficult. We’ve been carefully working through the best way to combine data from three studies, running numerous simulations under different model assumptions and outcome definitions. Rather than focusing on the specific details of each intervention, our core interest lies in understanding the link between early literacy and later-life outcomes.

Pooling across three studies, our estimates suggest that if literacy effects are consistent across contexts, we are powered to detect a 0.15–0.20 standard deviation (SD) increase in earnings for a one SD improvement in literacy. However, if there is moderate variation in effects across studies, the detectable impact increases to 0.30–0.35 SD.

These are sizable impacts, but they align with observed relationships between test scores and later-life outcomes, including earnings effects in datasets like the Kenya Life Panel Survey (0.24 SD), and in the studies previewed earlier in this blog. Similar associations are seen in test-score-to-schooling associations from longitudinal data from Ethiopia, India, Pakistan, Peru, Vietnam, and Kenya (ranging from 0.31 to 0.53 SD).

There is still room for improvement, particularly in how we anticipate and model effect size differences between pooled studies. Over the next year, we will explore how Bayesian methods can improve our synthesis of findings across a small number of studies.

Next steps: Expanding tracking, improving methods, and informing policy

Full-scale tracking will soon begin in Ghana and Uganda, taking advantage of the research infrastructure built during the successful pilot phase. But we’re also looking for more studies to track. We’re currently working with Pratham and the original PIs to explore a follow-up to Esther Duflo, Rukmini Banerji, Abhijit Banerjee, Mark Berry, Harini Kannan, Shobhini Mukerji, Marc Shotland, and Michael Walton’s evaluation of Pratham’s 2013–14 learning camps in Uttar Pradesh.

Several related efforts are also underway, each exploring long-term outcomes of early learning interventions—though they differ in terms of intervention type, context, and the age of participants now being followed. These include:

Jishnu Das and colleagues, who are continuing to track the 2003 Learning and Education Achievement in Punjab Schools (LEAPS) cohort—now in their 20s and 30s—to investigate the long-run links between schooling, early skills (some shaped by an RCT), and a broad range of adult outcomes;
Alex Eble and colleagues, who are following up, seven years on, with participants in highly effective early learning programmes in The Gambia and Guinea-Bissau;
Jason Kerwin, Rebecca Thornton, and colleagues, who are tracking individuals who benefited from the highly effective Northern Uganda Literacy Project, first launched in 2009 and scaled in 2014;
Jacobus Cilliers and colleagues, who are tracking children who took part in South Africa’s Early Grade Reading Study, (and having already reported results on the persistence and emergence of literacy skills some four years post intervention).

Many of these teams are already working on collaboration to develop common tracking approaches and measurement instruments so that when results come in, it’ll be possible to make meaningful cross-context comparisons.

Combining all of these efforts, we’re hopeful that within two to three years, policymakers will have a much better idea about the long-run payoff to improving basic literacy and numeracy skills.

For now, stay tuned, and please watch this space for more updates on Return to Learning.

Topics

Education and Child Well-Being

DISCLAIMER & PERMISSIONS

CGD's publications reflect the views of the authors, drawing on prior research and experience in their areas of expertise. CGD is a nonpartisan, independent organization and does not take institutional positions. You may use and disseminate CGD's publications under these conditions.

Thumbnail image by: GPE/Livia Barton

From Prospective to Prepared Teacher: A Global Study of Initial Teacher Education

Second Annual Research Conference on Global Lead Exposure

BLOG POST

Can Better Test Scores Lift Kids Out of Poverty? The Case for Long-Term Tracking of Education RCTs

Recommended

Blog Post

Will Raising Test Scores in Developing Countries Produce More Health, Wealth, and Happiness Later in Life?

The proliferation of education RCTs in developing countries in the 2000s has created a new opportunity to study long-term impacts

Table 1. Pilot studies for long-term tracking

Our baseline estimate is that a one standard deviation improvement in test scores should raise earnings by roughly 20-30 percent.

Table 2. The returns to cognitive skills in existing studies

Low attrition rates on longitudinal studies are achievable, with some caveats

Table 3. Results from individual tracking in Ghana and Uganda

Figure 1. Tracking rates in long-run follow-ups of experiments in low- and middle-income studies (plus our simple mean rate in red)

Few education RCTs are powered to study long-term impacts, suggesting the need to pool multiple studies

Next steps: Expanding tracking, improving methods, and informing policy

Topics

DISCLAIMER & PERMISSIONS

Events

From Prospective to Prepared Teacher: A Global Study of Initial Teacher Education

Second Annual Research Conference on Global Lead Exposure

BLOG POST

Can Better Test Scores Lift Kids Out of Poverty? The Case for Long-Term Tracking of Education RCTs

Recommended

Blog Post

Will Raising Test Scores in Developing Countries Produce More Health, Wealth, and Happiness Later in Life?

The proliferation of education RCTs in developing countries in the 2000s has created a new opportunity to study long-term impacts

Table 1. Pilot studies for long-term tracking

Our baseline estimate is that a one standard deviation improvement in test scores should raise earnings by roughly 20-30 percent.

Table 2. The returns to cognitive skills in existing studies

Low attrition rates on longitudinal studies are achievable, with some caveats

Table 3. Results from individual tracking in Ghana and Uganda

Figure 1. Tracking rates in long-run follow-ups of experiments in low- and middle-income studies (plus our simple mean rate in red)

Few education RCTs are powered to study long-term impacts, suggesting the need to pool multiple studies

Next steps: Expanding tracking, improving methods, and informing policy

Topics

DISCLAIMER & PERMISSIONS

More Reading

Blog Post

Ministerial Taskforce Renews Commitment to End Violence in and Around Schools

Event

From Prospective to Prepared Teacher

VIRTUAL

Blog Post

Skills for Jobs in Sub-Saharan Africa: An Urgent, Pressing Agenda

WORKING PAPER

Iron and Calcium Supplementation for Reducing Blood Lead Levels

Sign up to get Education and Child Well-Being updates: