Economists are increasingly using publicly shared data and code to check each other’s work, an exercise often called ‘replication’ testing. But this much-needed trend has not been accompanied by a consensus about what ‘replication’ means. If a follow-up study does not ‘replicate’ an original result, according to current usage of the term, this can mean anything from an unremarkable disagreement over methods to scientific incompetence or misconduct.
This paper proposes an unambiguous definition of replication. Many social scientists already use the term in the way suggested here, but many more do not. The paper contrasts this definition with decades of unsuccessful attempts to standardize terminology, and argues that many prominent results described as replication tests should not be described as such. It argues that professional associations should formally adopt this definition, thereby improving incentives for researchers to conduct more and better replication tests.