Enhancing Causal Effect Estimation with Diffusion-Generated Data

Xiaotong Shen, Co-Author
University of Minnesota

Wei Pan, Co-Author
University of Minnesota

LI Chen, Speaker

Sunday, Aug 3: 4:45 PM - 5:05 PM
Invited Paper Session 
Music City Center 

Description

Estimating causal effects from observational data is inherently challenging because counterfactual outcomes are never observed and unmeasured confounding may be present. Traditional methods often rely on restrictive, untestable assumptions or require valid instrumental variables, which limits their applicability and robustness. In this paper, we introduce Augmented Causal Effect Estimation (ACEE), an approach that uses synthetic data generated by a diffusion model to improve causal effect estimation. By fine-tuning pre-trained generative models, ACEE simulates counterfactual scenarios that are otherwise unobservable, enabling accurate estimation of individual and average treatment effects even under unmeasured confounding. Unlike conventional methods, ACEE relaxes the stringent unconfoundedness assumption and relies instead on an empirically checkable condition. A bias-correction mechanism mitigates inaccuracies in the synthetic data. We provide theoretical guarantees establishing the consistency and efficiency of the ACEE estimator, together with empirical validation through simulation studies and benchmark datasets. The results confirm that ACEE significantly improves causal estimation accuracy, particularly in complex settings with nonlinear relationships and heteroscedastic noise.
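
To make the problem in the description concrete, the following is a minimal simulation sketch, not the ACEE method itself. It assumes a hypothetical data-generating process (a single unmeasured confounder `u`, a constant true effect of 2.0, and the helper names `simulate`, `naive_difference`, and `oracle_ate`, none of which come from the paper). It shows why a naive comparison of treated and untreated units can be misleading when a confounder is unobserved, and why access to both potential outcomes, which real data never provide, would make the individual and average effects straightforward to compute.

```python
import numpy as np


def simulate(n=20000, seed=0):
    """Simulate observational data with an unmeasured confounder (hypothetical setup)."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)                          # confounder, never seen by the analyst
    t = (u + rng.normal(size=n) > 0).astype(int)    # treatment is more likely when u is large
    y0 = u + rng.normal(scale=0.5, size=n)          # potential outcome without treatment
    y1 = y0 + 2.0                                   # potential outcome with treatment (assumed effect 2.0)
    y = np.where(t == 1, y1, y0)                    # only the factual outcome is observed
    return t, y, y0, y1


def naive_difference(t, y):
    """Difference in observed mean outcomes between treated and untreated units."""
    return float(y[t == 1].mean() - y[t == 0].mean())


def oracle_ate(y0, y1):
    """Average of individual effects y1 - y0; needs both potential outcomes."""
    return float((y1 - y0).mean())


t, y, y0, y1 = simulate()
naive = naive_difference(t, y)
oracle = oracle_ate(y0, y1)
print(f"naive difference in means: {naive:.3f}")
print(f"oracle average treatment effect: {oracle:.3f}")
```

In this setup the naive difference overstates the effect because treated units tend to have larger values of `u`, while the oracle average recovers the assumed effect. The oracle is unavailable in practice, which is the gap that simulated counterfactuals are meant to address.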

Keywords

Causal effect estimation

Data augmentation

Unmeasured confounding

Generative models

Transfer learning