CGD senior fellow David Roodman has won the inaugural Stata Journal Editors’ Prize for what the editors termed “two outstanding papers:” How to do xtabond2: An introduction to difference and system GMM in Stata and Fitting fully observed recursive mixed-process models with cmp. Each article describes how to use a computer program he wrote to extend Stata, a widely used statistical toolset. Roodman’s papers have been cited thousands of times. The prize citation concludes:
“David Roodman has provided to the Stata community:
• Excellent programs substantially extending the functionality available to users
• Programs that incorporate innovative and sophisticated Mata programming
• Excellent accompanying articles in the Stata Journal that not only explain the programs but also are excellent free-standing pedagogic pieces in their own right”
CGD president Nancy Birdsall congratulated Roodman, noting that the prize is “a welcome reminder that at CGD we have found a way to attract world-class scholars who bring rigor as well as passion to this century's crucial challenge: reducing poverty and inequality in the world.”
Roodman explains his work and discusses his reaction to winning the prize in the Q&A below.
What is the “Stata Journal Editors’ Prize”?
Stata is software for doing statistics. It’s popular in academia, especially in the social sciences, which is why CGD uses it. StataCorp’s strategy is interesting: as a for-profit corporation it has worked to build a public, free-software ecosystem on its platform. People write and share add-on packages for it. Probably the strategy works well in the academic market because for researchers, reference to one’s work by others is a valuable professional commodity. To further reward contributors to this community, StataCorp sponsors an academic journal that publishes articles about user-written programs and the theory behind them. For the same reason, the journal this year inaugurated an annual award for the best contributions.
So why did you win the inaugural prize?
Partly out of luck. My two articles hit the journal in 2009 and 2011, the years that bracket the period for the inaugural prize. Aside from that, I think the editors appreciate the contributions for the professionalism I brought to the computer programming and the pedagogy I brought to the write-ups. I am a mathematician and programmer first (my degree is in math) and a social scientist second. I think that most contributors are the other way around. For good mathematicians and programmers, doing things right is paramount. Elegance is akin to truth. I have strived to make my programs elegant, powerful, and flexible. I’ve also tried to respond quickly to suggestions from users, in the spirit of continuous improvement. Meanwhile, I thrive on teaching the complex things I’ve figured out to others. That’s the common thread from this abstruse mathematical work to my broad book on microfinance to the Commitment to Development Index.
Why did you write the new programs for Stata?
I arrived at CGD in 2002 knowing little about the application of statistics to social sciences, what is called econometrics. One of my first projects was to help Senior Fellow Bill Easterly reconstruct an important study that found that foreign aid sped economic growth in countries with good economic policies. It was a powerful finding, which had influenced my own understanding. But when we added more years of data the key finding disappeared. This experience made a strong impression on me. It taught me that replication is a great way to learn econometrics. And it taught me that econometric work can be a black box that fools even the people who do it.
You are best known in the Stata community for “xtabond2.” What is that?
I wrote it in 2004 to implement a statistical method not then available in Stata. Called System GMM, it is used on data sets generated by observing many individuals—people, firms, or countries—a few times. The case at hand: data from some 100 countries in the six five-year periods between 1970 and 2000 on foreign aid and various economic and political indicators. The method makes a particular attack on a central econometric challenge: inferring causation from correlation. That is to say, statistics are only really good at telling us about patterns, such as whether faster-growing countries receive more aid. It’s much tougher to figure out which way the causal arrows go: is aid making countries grow faster or is faster growth merely attracting more aid? Ironically, the upshot of my making it easier for people to do System GMM was a distrust of studies that use it. Still, researchers have to make the best of the data they have, and sometimes the best is System GMM, so after I completed the CGD working paper about xtabond2 that became a Stata Journal article, I wrote A Note on the Theme of Too Many Instruments, to promote better use of it.
And what is your other program, “cmp,” for?
Econometrics is most natural when the outcomes studied vary across a wide range of numerical values. Examples are household net worth and IQ. But for some outcomes the range is hemmed in (you can’t borrow a negative amount of money) or discontinuous (you can’t be a little bit pregnant). And sometimes researchers want to analyze the determinants of several such variables at once. In the example that inspired the command, Mark Pitt and Shahidur Khandker set up a model in which several factors determine how much microcredit a Bangladeshi household would borrow, which in turn could affect whether, say, a child was in school, which is a binary characteristic like pregnancy. They model in two stages in an attempt, again, to surmise causation from correlation (explained here). They performed the complex math using custom code now reportedly lost. So I wrote cmp to fit this model and more. It is a way to mix and match models for these sorts of outcomes. It’s flexible, a sort of smartphone for Stata. It’s been used to study everything from inequality in health in the U.S. to the effect of remittances on poverty in Ghana.
What does the award mean to you?
I started programming when I was 12, on a mainframe computer under my father’s tutelage. At 20, I interned at Microsoft for a summer. It was fun, but I wanted to do something more meaningful than beating another software maker and parking an expensive car outside the office building. So I long struggled to meld my aptitudes to a larger purpose—as a teacher put it, to program for the revolution. In addition, my path at CGD has been unorthodox. I have learned econometrics by coding it, in order to engage in economics not as a producer, as a graduate economics program would have made me, but as an annoyingly demanding consumer. I wrote the programs in order to rerun and scrutinize important studies on the impact of foreign aid and microfinance. I felt this depth of review was necessary to inform my judgments on the substantive matters. But despite the sense of compulsion, I often questioned my judgment about my choice of direction. It is gratifying now to see that my work is of some use to others.
Editor’s Note: See also Ich bin ein Über-Geek