School Textbooks in Low- and Middle-Income Countries Are Overwhelmingly Sexist

It’s no surprise that books used in schools in many countries have gender biases. But in a new CGD working paper, we document exactly how much and what kind of bias exists in more than 1,200 books from 34 anglophone countries. This includes high-income countries such as the US, UK, and Australia, and poorer countries such as India, Pakistan, Uganda, and South Africa. We uncover three consistent patterns of bias:

  1. Women and girls are less present. Across almost all textbooks studied, female characters occurred less frequently than male ones. On average, only around 30 percent of gendered terms were female.
  2. Women are associated with domestic roles, family, and appearance. Female terms are more closely linked to words like "wedding", "home", and "beautiful", while male terms associate more with words like "leader", "authority", and "career". Amongst occupations, men are more likely to be shown as scientists.
  3. Women are depicted in more passive roles. Female characters were less likely to be the active subject of sentences: only 24 percent of female terms were capitalized (a rough indicator that the term starts a sentence), compared to 29 percent of male terms.

The degree of gender imbalance in textbooks correlates strongly with country-level measures of gender inequality and women's rights.

Figure 1: Countries in our corpus of books

Note: The 34 countries whose books are included in our corpus are coloured in teal. These are Afghanistan, Australia, Bangladesh, Belize, Bhutan, Dominica, Ethiopia, Guyana, India, Jamaica, Kenya, Kiribati, Lesotho, Liberia, Malawi, Maldives, Namibia, Nigeria, Pakistan, Papua New Guinea, Rwanda, Samoa, Sierra Leone, Solomon Islands, South Africa, South Sudan, Sri Lanka, St Kitts and Nevis, Tonga, Uganda, United Kingdom, United States, Zambia, and Zimbabwe.

Women and girls are less present in textbooks

Across our full sample of books there are over twice as many male words (e.g. he, him, his) as there are female words (e.g. she, her, hers). There is also substantial variation across countries. After adjusting for book length, grade, and subject, the countries with the lowest representation of women and girls are Afghanistan, Pakistan, Sri Lanka, and South Sudan, where fewer than one in three gendered words are female (Figure 2).
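The basic measure can be sketched in a few lines of Python. This is only an illustration: the word lists below are the short examples given above rather than the full lists used in the paper, and the tokenizer and the function name `female_share` are our own assumptions for this sketch.

```python
import re

# Illustrative word lists only (taken from the examples in the text);
# the full lists used in the paper are not reproduced here.
MALE_WORDS = {"he", "him", "his"}
FEMALE_WORDS = {"she", "her", "hers"}


def female_share(text):
    """Return the share of gendered words in `text` that are female.

    Returns None when the text contains no gendered words at all.
    """
    # Simple assumption: lowercase the text and keep alphabetic tokens.
    tokens = re.findall(r"[a-z]+", text.lower())
    male = sum(token in MALE_WORDS for token in tokens)
    female = sum(token in FEMALE_WORDS for token in tokens)
    total = male + female
    return female / total if total else None


sample = "He gave his sister the book. She thanked him and he smiled."
print(female_share(sample))
```

In this toy sample there are four male words and one female word, so one in five gendered words is female. The country comparisons in Figure 2 additionally adjust this per-book share for subject, grade, and book length, which the sketch does not attempt.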

Figure 2: Share of female words in different countries

Note: This figure shows the predicted share of gendered words that are female by country. The share is first calculated for each individual book, and these book-level shares are then estimated as a function of country, subject, grade, and (log) book length. We exclude countries with fewer than five books in our corpus.

Women appear more often in subjects like home economics

When we examine differences across textbooks in various subjects, we find that the subject with the highest representation of women is home economics. Religion had the lowest representation of women, and subjects such as math and science had less than equal representation (Figure 3). Shorter books also tend to have a lower share of female words.

Figure 3: Share of female words in different subjects

Note: This figure shows the predicted share of gendered words that are female by subject. The share is first calculated for each individual book, and these book-level shares are then estimated as a function of country, subject, grade, and (log) book length. We exclude countries with fewer than five books in our corpus.

Women are shown as librarians and illustrators, men as scientists

To look at stereotypes, we compare our list of gendered words (he, she, etc.) with a list of common occupations (teacher, doctor, etc.), and count the number of times a gendered word occurs in the same sentence as an occupation. So if there were five sentences with the words “She” and “Doctor” but only two sentences with “He” and “Doctor”, this would represent a female bias for that occupation. We can then compare the imbalance between how frequently male and female words co-occur with different occupations. Across the full corpus, the most female-associated occupations include housekeeper, thatcher, technician, librarian, supervisor, jewellery, illustrator, printer, nurse, and consultant. The most male-associated occupations include botanist, caretaker, navigator, porter, mathematician, astronomer, physicist, economist, surveyor, and blacksmith. Might the under-representation of women in scientific occupations in real life have something to do with the role models that girls are exposed to in their school books?
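The counting step can be illustrated with a minimal Python sketch. The word lists here are short stand-ins, the sentence splitter is deliberately naive, and the `female_skew` summary (the female share of co-occurrences) is an assumption made for illustration; it is not necessarily the exact imbalance measure used in the paper.

```python
import re
from collections import Counter

# Short illustrative lists; the paper's full lists are not reproduced here.
MALE_WORDS = {"he", "him", "his"}
FEMALE_WORDS = {"she", "her", "hers"}
OCCUPATIONS = {"teacher", "doctor"}


def cooccurrence_counts(text):
    """Count sentences in which an occupation appears with a gendered word."""
    counts = {"female": Counter(), "male": Counter()}
    # Naive assumption: sentences end at '.', '!' or '?'.
    for sentence in re.split(r"[.!?]+", text):
        tokens = set(re.findall(r"[a-z]+", sentence.lower()))
        for job in OCCUPATIONS & tokens:
            if tokens & FEMALE_WORDS:
                counts["female"][job] += 1
            if tokens & MALE_WORDS:
                counts["male"][job] += 1
    return counts


def female_skew(counts, job):
    """Hypothetical summary: female share of gendered co-occurrences for a job."""
    female = counts["female"][job]
    male = counts["male"][job]
    total = female + male
    return female / total if total else None


# The example from the text: five "She ... doctor" and two "He ... doctor" sentences.
text = "She is a doctor. " * 5 + "He is a doctor. " * 2
counts = cooccurrence_counts(text)
print(counts["female"]["doctor"], counts["male"]["doctor"])
print(female_skew(counts, "doctor"))
```

With five female and two male co-occurrences, "doctor" comes out female-skewed in this toy text, matching the worked example above.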

Figure 4: Top 10 most female- and male-biased jobs

Women are shown as “beautiful”, men as “powerful” and “complex”

Next, we look beyond occupations to see how men, women, boys, and girls are described more broadly. Using a technique known as “part-of-speech tagging”, we can identify the most common adjectives and verbs used to describe men and women in books. The adjective with the biggest skew towards women is “beautiful” and that with the biggest skew towards men is “complex”. The verbs with the largest female skew are “cooking”, “cooked”, “sang”, and “marry”, whilst those for males are “refracted”, “preached”, “revealed”, and “reflected”.

Figure 5: Adjectives and verbs with the most relative mentions for each gender

An alternative tool is “word embeddings”, which estimate quantitatively how similar different words are. We compare our list of gendered words to sets of words related to achievement, appearance, work, and home. Almost all of the achievement and work-related words are estimated as being more similar to male words, and almost all of the home and appearance-related words are more similar to female words.

Figure 6: Similarity between gendered words and words related to achievement, appearance, work, and home

Note: This figure shows how terms relating to four themes are associated with gender terms in our embeddings. Male bias is calculated as the difference between the average cosine similarity of the theme word with the set of male gender terms, and the average similarity of the theme word with the set of female gender terms. Confidence intervals are calculated as the standard deviation for this statistic, over 50 bootstrap samples, where samples are generated by sampling all sentences in our corpus with replacement.
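The male bias statistic described in this note can be written out as a short Python sketch. The two-dimensional vectors below are made up purely for illustration (real embeddings are trained on the corpus and have many more dimensions), and the function names are our own. The bootstrap step is not shown.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def male_bias(theme_vec, male_vecs, female_vecs):
    """Average similarity to male terms minus average similarity to female terms."""
    male_sim = sum(cosine(theme_vec, m) for m in male_vecs) / len(male_vecs)
    female_sim = sum(cosine(theme_vec, f) for f in female_vecs) / len(female_vecs)
    return male_sim - female_sim


# Hypothetical toy embeddings, invented for this example only.
embeddings = {
    "he": [1.0, 0.0],
    "him": [0.9, 0.1],
    "she": [0.0, 1.0],
    "her": [0.1, 0.9],
    "leader": [0.8, 0.2],
    "home": [0.2, 0.8],
}
male = [embeddings[word] for word in ("he", "him")]
female = [embeddings[word] for word in ("she", "her")]

leader_bias = male_bias(embeddings["leader"], male, female)
home_bias = male_bias(embeddings["home"], male, female)
print(leader_bias, home_bias)
```

A positive value means the theme word sits closer to the male terms, and a negative value means it sits closer to the female terms. In these toy vectors "leader" is positive and "home" is negative by construction.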

Lessons for donors and governments

Getting all girls into school remains a high-profile global priority. But what girls (and boys) are being taught when they get there clearly matters too. The textbooks that we analyze form the core basis of lessons in many countries. Modern natural language processing techniques like these allow for a rapid and scalable audit of large volumes of text, identifying where there are issues that need to be addressed.

In many lower-income countries school books are funded by foreign aid programmes. We find that these donor-funded books perform slightly better, but we do still see bias with under-representation of women and girls. Frustratingly, many donor-funded and publicly-funded books are not freely available for analysis. Our research was made possible by a global move during COVID lockdowns to make materials available online, but there are still books that are not open access, even where agreements exist stating that they should be.

Removing gender bias from school textbooks won’t solve gender inequality by itself. But it could be a low-cost and scalable way of reaching millions of children, and normalizing progressive gender roles, rather than exposing them to regressive stereotypes. Officially approved school books represent a statement of official intent. Even if books had no impact on gender norms, do we really want official government materials to be promoting a patriarchal worldview? It seems unlikely that girls' education will achieve its full promise if girls are being shown that they don’t belong in positions of achievement and authority.


CGD blog posts reflect the views of the authors, drawing on prior research and experience in their areas of expertise. CGD is a nonpartisan, independent organization and does not take institutional positions.
