The UN is about to delete literacy and numeracy in early primary school from the list of Sustainable Development Goals because of a technical debate about measurement and a turf war between testing agencies. It shouldn’t.

When the United Nations set the Sustainable Development Goals (SDGs) in 2015, they included an array of targets and indicators by which to measure progress. For SDG 4, which focuses on education, the very first indicator (SDG 4.1.1a) was the proportion of kids in grades 2 and 3 “achieving at least a minimum proficiency level in (i) reading and (ii) mathematics, by sex.”

We’re now just over halfway to the SDG end date in 2030. Ideally, we’d be taking stock of progress on those many (many) targets. But for a lot of key indicators, including SDG 4.1.1a, we don’t really have the data to do that.

Counter-intuitively, indicators without sufficient coverage will get dropped (or rather demoted in the UN’s byzantine SDG monitoring framework). To get the coverage needed, education experts need to agree on how to combine lots of different tests. That’s a difficult technical question. One of us co-authored a whole paper on a Rosetta Stone for Human Capital.

It has also sparked a turf war between UN agencies and other entities over whose tests will count and whose won’t. Officially, UNESCO is in charge of measuring SDG 4.1.1a. But as we argue below, the best hope for measuring it comes from civil society and a survey run by UNICEF. At present, we’re in a position where experts from rival agencies and organizations cannot agree on a unified measurement framework for reading and math, and so instead, they’ll just drop the effort altogether.

What makes for a decent learning metric?

We don’t want to wave away the genuine technical hurdles here. But it’s worth reflecting on what the UN really needs from a learning metric. The basic characteristics you want out of a test are:

  1. A reliable, meaningful measure of student learning
  2. Representative coverage, which – to foreshadow a debate here – is a point in favor of household- rather than school-based samples in low-income contexts, where net enrollment or at least attendance might be highly imperfect.
  3. Comparability over time, so that we can tell if things are improving or not
  4. Perhaps least important, but nice to have: comparability across countries

Ideally, there would be some sort of big global learning assessment that spanned every country on earth, and was repeated on a regular basis. Since this doesn’t exist, we’re left with a patchwork of regional and national assessments.

Inevitably, building that patchwork is going to require compromises on points 1-4 above. The most respectable part of the current debate is between people who are super rigorous about #1 and #4 (the psychometric properties of reliable assessments) versus those who would make some practical compromises. 

The less respectable part of the debate is a naked turf war, with various groups lobbying to get their own assessment counted in the SDG framework (and the funding and prestige that would come with that), and UNESCO—the official keeper of SDG indicator 4.1.1a—trying to avoid ceding too much ground to UNICEF.

The existing global patchwork of early-grade learning measures has a lot of holes, especially in Africa  

The map shows which countries have conducted various kinds of learning assessments since the launch of the SDGs in 2015. Among developing countries, two quasi-governmental efforts at the regional level have filled in some of the learning map. Much of Latin America is covered by the tests run by the Laboratorio Latinoamericano de Evaluación de la Calidad de la Educación, or LLECE, which are unusual among international assessments in focusing on early primary grades. And in West Africa, the revitalization of the Programme d’Analyse des Systèmes Éducatifs de la CONFEMEN, aka PASEC, has helped fill in many gaps as well.

The core argument here is about whether to let the orange (ASER), pale green (EGRA), and red (UNICEF) dots count. The purists want only the dark green (LLECE) and blue (PASEC) dots (i.e., standardized tests that rely on item response theory). But remaining pure also likely means dropping the SDG 4.1.1a indicator from the official list.

Civil society can fill some of the gaps

In South Asia and East Africa, civil society has taken the lead to measure learning where governments weren’t. Specifically, millions of kids whose learning outcomes wouldn’t otherwise be tracked in any representative, comparable way at all have been tested as part of the ASER initiative pioneered by the NGO Pratham in India. The simple test of numeracy and literacy administered cheaply and at large scale has spread to Pakistan, across East Africa, and beyond through the PAL Network of like-minded civil society groups with philanthropic funding. These tests have important limitations—they only cover rural areas, and the assessments are rudimentary. They are nonetheless still better than nothing. 

And that still leaves a bunch of gaps: countries with no representative, internationally comparable data on learning outcomes for kids in early primary school.

UNICEF household surveys are another good option, especially in poorer countries with low learning levels and lots of kids out of school

In 2016, UNICEF introduced a new module on literacy and numeracy among children into its standard household survey template. Those surveys, known as Multiple Indicator Cluster Surveys or MICS, are commonly conducted every few years in most low- and lower-middle income countries. They’re the source of a lot of what we know about global challenges like malnutrition, child mortality, and vaccination coverage rates.

The UNICEF Foundational Learning module is adapted from the Early Grade Reading and Mathematics assessments (EGRA and EGMA), and assesses children aged 7-14. It includes Oral Reading Accuracy, Reading Comprehension, Number Identification, Quantitative Comparisons, Addition, and “Missing Number”. Direct comparisons show a strong similarity between children’s scores on the UNICEF assessment and on the more established EGRA / EGMA.

Potential coverage of SDG 4.1.1a if the UN does or doesn’t include “less rigorous” survey measures

Including those ASER, EGRA, and UNICEF surveys in the SDG 4.1.1a map of learning outcomes would significantly improve total coverage. Without them, the map includes 11 low-income countries, 14 lower-middle, and 21 upper-middle income countries. Add in the ASER, EGRA, and UNICEF surveys, and those numbers rise to 19, 31, and 30. That amounts to about a 74 percent increase in country coverage (from 46 to 80 countries).
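For readers who want to check that arithmetic, here is a minimal sketch in Python. The country counts come straight from the paragraph above; the variable and function names are ours, chosen purely for illustration.

```python
# Country counts by income group, taken from the paragraph above.
# "baseline" = without ASER, EGRA, and UNICEF surveys; "expanded" = with them.
baseline = {"low": 11, "lower_middle": 14, "upper_middle": 21}
expanded = {"low": 19, "lower_middle": 31, "upper_middle": 30}


def coverage_gain(before, after):
    """Return total countries before, total after, and the percent increase."""
    total_before = sum(before.values())
    total_after = sum(after.values())
    pct_increase = 100 * (total_after - total_before) / total_before
    return total_before, total_after, pct_increase


total_before, total_after, pct_increase = coverage_gain(baseline, expanded)
print(f"{total_before} -> {total_after} countries (+{pct_increase:.0f}%)")
```

The totals are 46 and 80 countries, and 34 additional countries on a base of 46 works out to roughly 74 percent.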

Most of the gain in coverage comes from UNICEF. So whilst the debate might ostensibly be fought on psychometric grounds, there’s more than a hint of a sad bureaucratic turf war. 

Other SDGs lack strictly comparable indicators too

Education isn’t unique in lacking perfect data. SDG 1 (ending poverty) includes an indicator on the share of people living below national poverty lines, which differ from country to country (SDG 1.2.1). SDG 2 (ending hunger) relies on model-based estimates of undernourishment by FAO to smooth out the imperfections of national household expenditure surveys. Similarly, maternal mortality estimates (SDG 3, health) use a Bayesian approach developed by WHO and partners to combine the different sources of data and methods used by different countries. Even within education, other indicators rely on a combination of household surveys and official government data, which are collected in different ways and use differing definitions of the outcome of interest (such as what counts as pre-school age).

Test scores aren’t everything, but ignoring them doesn’t help anyone

There’s more to schooling than test scores. Here at the Center for Global Development, we’ve published a lot of pieces in the past few years arguing for greater attention to the epidemic of physical and sexual violence in schools, more money for school feeding programs to address nutritional shortages, and better monitoring of lead exposure. While they may all affect learning, you can’t make much progress on any of those things if you focus only on test scores.

Still, you can’t run a serious education system without test scores. Kids everywhere need to know how to read and do basic arithmetic. If they’re not getting those skills in early primary school, then education systems are failing them. Goals and targets are not a panacea. But dropping foundational learning from the Sustainable Development Goal on education suggests a lack of seriousness around the issue. 




CGD blog posts reflect the views of the authors, drawing on prior research and experience in their areas of expertise. CGD is a nonpartisan, independent organization and does not take institutional positions.