We randomly sampled 37 evaluations and applied a standardized assessment approach with two reviewers rating each evaluation. To answer questions about evaluation quality, we used three criteria from the evaluation literature: relevance, validity, and reliability. We constructed four aggregate scores (on a three-point scale) to correspond with these criteria. Overall, we found that most evaluations did not meet social science standards in terms of relevance, validity, and reliability; only a relatively small share of evaluations received a high score.