Estimation and Inference in Cluster Randomized Trials with Few Large Clusters for Binary Outcomes

Donna Spiegelman Co-Author
Yale School of Public Health
 
Fan Li Co-Author
Yale School of Public Health
 
Zachary Frere First Author
 
Zachary Frere Presenting Author
 
Thursday, Aug 7: 9:20 AM - 9:35 AM
2767 
Contributed Papers 
Music City Center 
Cluster randomized trials (CRTs) are essential for evaluating cluster-level interventions in medicine and public health. However, many CRTs include only a few clusters, as in hospital-based interventions where a small number of large hospitals are randomized. Conventional methods often require at least 30–40 clusters for reliable inference. This study uses simulations to evaluate statistical methods for CRTs with binary outcomes when there are at most 10 clusters, each of large size. We investigate whether asymptotic properties hold in this challenging yet common scenario.
We compare generalized estimating equations (GEE), generalized linear mixed models (GLMM), cluster-level summaries (CLS), and randomization-based methods (RB). In our simulations, GLMM and CLS performed best in terms of Type I error and power. RB maintained the nominal Type I error but had lower power than CLS and GLMM. GEE had the worst Type I error control: the standard sandwich variance estimator inflated Type I error, while bias-corrected versions tended to push it below the nominal level. These findings can better guide the choice of analytic methods for CRTs with few but large clusters, supporting more robust inference in real-world settings.
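To make the comparison concrete, the following is a minimal sketch, not the authors' simulation code, of how one might simulate a small CRT with a binary outcome and apply two of the simpler approaches named above: a cluster-level summary (the difference in mean cluster proportions) and an exact randomization test on that summary. All function names and parameter values (8 clusters of 500, a logistic random-intercept model, the between-cluster standard deviation) are illustrative assumptions that do not come from the abstract.

```python
import itertools
import math
import random
import statistics


def simulate_crt(n_clusters=8, cluster_size=500, base_logodds=-1.0,
                 effect=0.0, sigma_b=0.3, seed=0):
    """Simulate one trial; return (arms, per-cluster outcome proportions).

    Assumed data-generating model: logistic with a normal random
    cluster intercept. `effect` is the log odds ratio for treatment.
    """
    rng = random.Random(seed)
    arms = [1] * (n_clusters // 2) + [0] * (n_clusters - n_clusters // 2)
    rng.shuffle(arms)  # balanced randomization of clusters to arms
    props = []
    for arm in arms:
        b = rng.gauss(0.0, sigma_b)  # cluster-level random intercept
        p = 1.0 / (1.0 + math.exp(-(base_logodds + effect * arm + b)))
        events = sum(rng.random() < p for _ in range(cluster_size))
        props.append(events / cluster_size)
    return arms, props


def cls_difference(arms, props):
    """Cluster-level summary: difference in mean cluster proportions."""
    treated = [p for a, p in zip(arms, props) if a == 1]
    control = [p for a, p in zip(arms, props) if a == 0]
    return statistics.mean(treated) - statistics.mean(control)


def randomization_p_value(arms, props):
    """Exact two-sided p-value over all assignments with the same arm sizes."""
    n = len(arms)
    n_treated = sum(arms)
    observed = abs(cls_difference(arms, props))
    count = total = 0
    for treated_idx in itertools.combinations(range(n), n_treated):
        chosen = set(treated_idx)
        perm = [1 if i in chosen else 0 for i in range(n)]
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(cls_difference(perm, props)) >= observed - 1e-12:
            count += 1
    return count / total


def type1_error(n_sims=200, alpha=0.05, **sim_kwargs):
    """Monte Carlo rejection rate under the null (effect fixed at 0)."""
    rejections = 0
    for s in range(n_sims):
        arms, props = simulate_crt(effect=0.0, seed=s, **sim_kwargs)
        if randomization_p_value(arms, props) <= alpha:
            rejections += 1
    return rejections / n_sims


if __name__ == "__main__":
    print("Estimated Type I error (randomization test):", type1_error())
```

With 8 clusters split 4 versus 4 there are only 70 possible assignments, so the smallest attainable two-sided p-value is 2/70. This discreteness is one reason a randomization test can hold its Type I error at or below the nominal level while giving up power when clusters are few.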

Keywords

Cluster Randomized Trials

Multilevel Models

Type I Error

Simulation Study

Few Clusters

Inference 

Main Sponsor

ENAR