10. All models are wrong, but some are useful: An Evaluation of Estimators for the Conditional Average Treatment Effect (CATE).

Conference: Women in Statistics and Data Science 2025
11/12/2025: 3:00 PM - 4:00 PM EST
Speed 

Description

Inferring heterogeneity of treatment effect is a popular secondary aim of clinical trials. Recently, many trial analyses have moved from traditional subgroup analyses to more modern assessments of heterogeneity using machine learning. Although several such methods are available to estimate conditional average treatment effects (CATEs) in clinical trials, they are often applied to trials with smaller sample sizes than those considered in the simulations of the corresponding seminal methodological work, leaving the validity of inference in these settings unclear. To provide guidance to practitioners, we conducted a simulation study to evaluate the performance of different regression and machine learning estimators for the CATE across a variety of settings and a range of sample sizes. The estimators included ordinary least squares (OLS), Bayesian Additive Regression Trees (BART), and causal forests with both default settings and cross-validation-based hyperparameter tuning.

We evaluated 95% confidence interval (CI) coverage, bias, and variance under linear and nonlinear data-generating mechanisms (DGMs) in the presence of 0 to 40 nuisance covariates and 0 to 16 effect-modifying covariates. We found that while tree-based ensembles such as causal forests can flexibly accommodate both linear and nonlinear settings, their coverage can be meaningfully impaired in many settings at the sample sizes typical of most trial applications. As expected, OLS performs best under linear DGMs but poorly under nonlinear DGMs. We conclude with recommendations for practitioners.
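To make the evaluation metrics concrete, the following is a minimal sketch of how CI coverage, bias, and variance could be estimated by simulation for a single estimator. It is not the authors' simulation design: it assumes a 1:1 randomized trial with one effect modifier and no nuisance covariates, uses hypothetical coefficients, covers only OLS with a treatment-by-covariate interaction, and builds the 95% CI from a normal approximation. The function names (`true_cate`, `simulate_trial`, `ols_cate`, `evaluate`) are illustrative.

```python
import numpy as np


def true_cate(x, linear=True):
    """Hypothetical true CATE: linear or quadratic in one effect modifier."""
    return 1.0 + 0.5 * x if linear else 1.0 + 0.5 * x**2


def simulate_trial(n, linear, rng):
    """Draw one trial: covariate x, 1:1 randomized treatment a, outcome y."""
    x = rng.normal(size=n)
    a = rng.integers(0, 2, size=n)
    y = 0.5 * x + a * true_cate(x, linear) + rng.normal(size=n)
    return x, a, y


def ols_cate(x, a, y, x0):
    """Fit OLS with a treatment-by-covariate interaction.

    Returns the estimated CATE at x0 and its model-based standard error.
    """
    X = np.column_stack([np.ones_like(x), x, a, a * x])
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    c = np.array([0.0, 0.0, 1.0, x0])  # contrast: CATE(x0) = beta_a + beta_ax * x0
    est = c @ beta
    se = np.sqrt(sigma2 * (c @ XtX_inv @ c))
    return est, se


def evaluate(n, linear, x0=1.5, reps=2000, seed=0):
    """Monte Carlo estimates of bias, variance, and 95% CI coverage at x0."""
    rng = np.random.default_rng(seed)
    truth = true_cate(x0, linear)
    ests = np.empty(reps)
    covered = np.empty(reps, dtype=bool)
    for r in range(reps):
        x, a, y = simulate_trial(n, linear, rng)
        est, se = ols_cate(x, a, y, x0)
        ests[r] = est
        covered[r] = abs(est - truth) <= 1.96 * se  # normal-approximation CI
    return {
        "bias": ests.mean() - truth,
        "variance": ests.var(ddof=1),
        "coverage": covered.mean(),
    }


if __name__ == "__main__":
    for linear in (True, False):
        print("linear DGM" if linear else "nonlinear DGM", evaluate(200, linear))
```

Under these assumptions, comparing `evaluate(200, True)` with `evaluate(200, False)` shows the pattern described for OLS: near-nominal coverage when the DGM matches the model, and degraded coverage when the true CATE is nonlinear. Extending the sketch to other estimators would mean swapping `ols_cate` for a function with the same return values.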

Keywords

Heterogeneous Treatment Effects

Machine learning

causal inference

simulation study

causal forests

advice for practitioners 

Presenting Author

Lisa Levoir, Vanderbilt University

First Author

Lisa Levoir, Vanderbilt University

CoAuthor(s)

Andrew Spieker, Vanderbilt University Medical Center
Bryan Blette, Vanderbilt University Medical Center

Target Audience

Beginner

Tracks

Knowledge
Women in Statistics and Data Science 2025