Diagnosing the role of observed distribution shift in scientific replications

Ying Jin (First Author, Presenting Author), Stanford University
Dominik Rothenhaeusler (Co-Author)

Tuesday, Aug 6: 2:05 PM - 2:20 PM
2971 
Contributed Papers 
Oregon Convention Center 
Many researchers have identified distribution shift as a likely contributor to the reproducibility crisis in the behavioral and biomedical sciences. The idea is that if treatment effects vary with individual characteristics and experimental contexts, then studies conducted in different populations will estimate different average effects. This paper uses "generalizability" methods to quantify how much of the effect-size discrepancy between an original study and its replication can be explained by distribution shift on observed unit-level characteristics. More specifically, we decompose this discrepancy into components attributable to sampling variability (including publication bias), observable distribution shifts, and residual factors. We compute this decomposition for several directly replicated behavioral science experiments and find little evidence that observable distribution shifts contribute appreciably to non-replicability. In some cases, this is because there is too much statistical noise; in others, there is strong evidence that controlling for additional moderators is necessary for reliable replication.
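To make the decomposition concrete, here is a minimal sketch on synthetic data. It is not the authors' implementation; it illustrates the general idea of reweighting the replication sample toward the original study's covariate distribution, so that the discrepancy between the two effect estimates splits into a part explained by the observed covariate shift and a residual (sampling noise plus unobserved factors). All variable names and the simulated data-generating process are hypothetical; the density ratio is known in closed form here only because both covariate distributions are Gaussian (in practice it would be estimated, e.g. via logistic regression).

```python
import numpy as np

rng = np.random.default_rng(0)

def effect_estimate(y, t):
    """Difference-in-means treatment effect estimate."""
    return y[t == 1].mean() - y[t == 0].mean()

def weighted_effect(y, t, w):
    """Difference in weighted means, with unit-level weights w."""
    return (np.average(y[t == 1], weights=w[t == 1])
            - np.average(y[t == 0], weights=w[t == 0]))

# Hypothetical setup: the treatment effect is moderated by a covariate x,
# and the two study populations differ in the distribution of x.
n = 5000
x_orig = rng.normal(0.0, 1.0, n)   # original study: x ~ N(0, 1)
x_rep  = rng.normal(0.5, 1.0, n)   # replication:    x ~ N(0.5, 1) (shifted)

def simulate(x):
    t = rng.integers(0, 2, x.size)                     # randomized treatment
    y = 1.0 + 0.8 * x * t + rng.normal(0, 1, x.size)   # effect = 0.8 * x
    return t, y

t_o, y_o = simulate(x_orig)
t_r, y_r = simulate(x_rep)

tau_orig = effect_estimate(y_o, t_o)
tau_rep  = effect_estimate(y_r, t_r)

# Density ratio N(0,1)/N(0.5,1): reweights replication units toward the
# original study's covariate distribution.
log_w = (-(x_rep - 0.0) ** 2 + (x_rep - 0.5) ** 2) / 2.0
w = np.exp(log_w)
tau_rep_rw = weighted_effect(y_r, t_r, w)

# Decompose the total discrepancy into an observed-shift component
# and a residual; the three terms sum exactly by construction.
total      = tau_orig - tau_rep
shift_part = tau_rep_rw - tau_rep    # explained by observed covariates
residual   = tau_orig - tau_rep_rw   # sampling noise + unobserved factors
```

In this simulation the covariate shift explains most of the discrepancy by design; the paper's point is that in real replication data this component is often small relative to sampling variability and residual factors.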

Keywords

Replicability

Distribution shift

Treatment effect

Generalizability 

Main Sponsor

IMS