A Bayesian causal forest approach for modeling heterogeneous effects in RCTs with missing data

Chung-Chou Chang Co-Author
University of Pittsburgh
 
Victor Talisa First Author
University of Pittsburgh
 
Victor Talisa Presenting Author
University of Pittsburgh
 
Tuesday, Aug 5: 2:05 PM - 2:20 PM
2744 
Contributed Papers 
Music City Center 

Description

Randomized controlled trials (RCTs) are increasingly used to model conditional average treatment effects (CATEs) with baseline patient characteristics as predictors, generating evidence that can inform personalized treatment strategies. However, missing values in baseline predictors are ubiquitous in RCT datasets, which complicates both model derivation and model evaluation. We develop a family of extensions to the Bayesian Causal Forest (BCF) model that incorporate missing data priors and are compatible with continuous and binary outcomes. Furthermore, we allow flexibility beyond the usual missing-at-random assumption by exploiting the binary decision trees that form the basis of BCF's model structure. We show via simulations that, compared to existing strategies for missing data in CATE modeling, our approach improves accuracy and uncertainty quantification when summarizing the evidence for heterogeneous CATEs in RCTs with missing values in the predictor variables. Finally, we demonstrate the practical utility of our method through analysis of existing trial datasets.
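
For reference, the two quantities named in the abstract can be written out as follows. This is a sketch of the standard CATE definition and a simplified form of the standard BCF decomposition from the general literature, not the authors' extended model. The symbols (Y_i, Z_i, x_i, \mu, \tau, \varepsilon_i, \sigma) are notation assumed here, and the missing data priors mentioned in the abstract are not shown.

```latex
% CATE at covariate value x, with potential outcomes Y(1) and Y(0):
\[
  \tau(x) = \mathbb{E}\left[\, Y(1) - Y(0) \mid X = x \,\right]
\]

% Under randomization, treatment Z is independent of the potential outcomes,
% so the CATE equals a difference of observable conditional means:
\[
  \tau(x) = \mathbb{E}\left[\, Y \mid X = x,\ Z = 1 \,\right]
          - \mathbb{E}\left[\, Y \mid X = x,\ Z = 0 \,\right]
\]

% Simplified BCF decomposition for a continuous outcome (assumed notation):
% \mu is the prognostic function and \tau the treatment-effect function,
% each assigned its own tree-ensemble prior.
\[
  Y_i = \mu(x_i) + \tau(x_i)\, Z_i + \varepsilon_i,
  \qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2)
\]
```

Modeling \tau separately from \mu is what lets a BCF-style model regularize treatment effect heterogeneity directly, instead of recovering it as a difference between two separately fitted outcome surfaces.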

Keywords

Bayesian statistics

Missing data

Conditional average treatment effects

Randomized controlled trials

Statistical learning 

Main Sponsor

Biopharmaceutical Section