Diagnosing the role of observed distribution shift in scientific replications
Ying Jin
First Author
Stanford University
Ying Jin
Presenting Author
Stanford University
Tuesday, Aug 6: 2:05 PM - 2:20 PM
2971
Contributed Papers
Oregon Convention Center
Many researchers have identified distribution shift as a likely contributor to the reproducibility crisis in behavioral and biomedical sciences. The idea is that if treatment effects vary across individual characteristics and experimental contexts, then studies conducted in different populations will estimate different average effects. This paper uses "generalizability" methods to quantify how much of the effect size discrepancy between an original study and its replication can be explained by distribution shift on observed unit-level characteristics. More specifically, we decompose this discrepancy into "components" attributable to sampling variability (including publication bias), observable distribution shifts, and residual factors. We compute this decomposition for several directly replicated behavioral science experiments and find little evidence that observable distribution shifts contribute appreciably to non-replicability. In some cases, this is because there is too much statistical noise. In other cases, there is strong evidence that controlling for additional moderators is necessary for reliable replication.
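To make the idea of the decomposition concrete, here is a minimal sketch. It is an illustration only and not the authors' estimator, which the abstract does not specify. It assumes a single discrete observed covariate, a randomized binary treatment, and that every covariate stratum contains both treated and control units in both studies. The function names `stratum_effects` and `decompose_discrepancy` are hypothetical. The sketch reweights the original study's stratum-level effects to the replication's covariate distribution (simple post-stratification) and splits the discrepancy into an observed-shift part and a residual part. Sampling variability, which the abstract treats as a separate component, is not modeled here.

```python
import numpy as np


def stratum_effects(x, t, y, strata):
    """Treated-minus-control mean outcome within each covariate stratum."""
    return np.array([
        y[(x == s) & (t == 1)].mean() - y[(x == s) & (t == 0)].mean()
        for s in strata
    ])


def decompose_discrepancy(orig, rep):
    """Split the effect discrepancy into observed-shift and residual parts.

    orig, rep: dicts of NumPy arrays with keys 'x' (discrete covariate),
    't' (0/1 treatment indicator), and 'y' (outcome).
    Assumption: both studies cover the same strata, each with both arms.
    """
    strata = np.unique(orig["x"])

    # Share of units in each stratum, per study.
    w_orig = np.array([(orig["x"] == s).mean() for s in strata])
    w_rep = np.array([(rep["x"] == s).mean() for s in strata])

    tau_orig = stratum_effects(orig["x"], orig["t"], orig["y"], strata)
    tau_rep = stratum_effects(rep["x"], rep["t"], rep["y"], strata)

    theta_orig = w_orig @ tau_orig          # original study, own population
    theta_rep = w_rep @ tau_rep             # replication, own population
    theta_transported = w_rep @ tau_orig    # original effects, replication mix

    return {
        "total": theta_orig - theta_rep,
        "observed_shift": theta_orig - theta_transported,
        "residual": theta_transported - theta_rep,
    }
```

By construction the two parts sum to the total discrepancy. A residual near zero would suggest that the observed covariate accounts for the gap between studies, while a large residual points to unobserved moderators or other factors. Any real analysis would also need uncertainty estimates for each part.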
Replicability
Distribution shift
Treatment effect
Generalizability
Main Sponsor
IMS