28 Communication-efficient distributed estimation of causal effects with high-dimensional data
Yong Chen
Co-Author
University of Pennsylvania, Perelman School of Medicine
Tuesday, Aug 6: 10:30 AM - 12:20 PM
3227
Contributed Posters
Oregon Convention Center
We propose a communication-efficient algorithm for estimating the average treatment effect (ATE) when the data are distributed across multiple sites and the number of covariates may be much larger than the sample size at each site. Our main idea is to calibrate the estimates of the propensity score and outcome models using suitable surrogate loss functions so that they approximately attain the desired covariate balancing property. We show that, even under possible model misspecification, our distributed covariate balancing propensity score estimator (disthdCBPS) approximates the global estimator, obtained by pooling the data from all sites, at a fast rate. As a result, our estimator remains consistent and asymptotically normal. In addition, when both the propensity score and the outcome models are correctly specified, the proposed estimator attains the semiparametric efficiency bound. We illustrate the performance of the proposed method in both simulation and empirical studies.
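To make the distributed setting concrete, the following is a minimal sketch of how a doubly robust ATE estimate can be assembled from per-site summary statistics, so that no individual-level data leave a site. It is not the disthdCBPS algorithm: it uses the standard augmented inverse probability weighting (AIPW) score and assumes that fitted propensity scores and outcome predictions are already available at each site, whereas the calibration with surrogate loss functions described in the abstract is not reproduced here. The function names `site_summary` and `pooled_ate` are hypothetical and chosen only for illustration.

```python
import numpy as np


def site_summary(y, t, ps, mu1, mu0):
    """Summarize one site's data as (sum of scores, sum of squared scores, n).

    y   : observed outcomes
    t   : binary treatment indicators (0 or 1)
    ps  : fitted propensity scores P(T = 1 | X), assumed strictly in (0, 1)
    mu1 : fitted outcome predictions under treatment
    mu0 : fitted outcome predictions under control
    All nuisance estimates are assumed to be supplied by the caller.
    """
    y, t, ps, mu1, mu0 = (np.asarray(a, dtype=float) for a in (y, t, ps, mu1, mu0))
    # Standard AIPW score for each unit.
    score = mu1 - mu0 + t * (y - mu1) / ps - (1 - t) * (y - mu0) / (1 - ps)
    return score.sum(), (score ** 2).sum(), score.size


def pooled_ate(summaries):
    """Combine per-site summaries into an ATE estimate and a standard error."""
    total = sum(s[0] for s in summaries)
    total_sq = sum(s[1] for s in summaries)
    n = sum(s[2] for s in summaries)
    ate = total / n
    # Plug-in variance of the score, divided by n for the variance of the mean.
    variance = (total_sq / n - ate ** 2) / n
    return ate, float(np.sqrt(variance))


if __name__ == "__main__":
    # Simulated example: three sites, known propensity 0.5, true effect 2.
    rng = np.random.default_rng(0)
    summaries = []
    for _ in range(3):
        n = 200
        t = rng.integers(0, 2, n)
        y = rng.normal(size=n) + 2.0 * t
        summaries.append(
            site_summary(y, t, np.full(n, 0.5), np.full(n, 2.0), np.zeros(n))
        )
    print(pooled_ate(summaries))
```

Each site sends only three numbers, and the combined estimate coincides with the AIPW estimate computed on the pooled data when the nuisance estimates are the same. In the high-dimensional setting of the abstract, the harder step is estimating the nuisance models themselves with limited communication, which this sketch deliberately leaves out.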
Causal Inference
High-dimensional Statistics
Double robustness
Distributed inference
Communication efficiency
Likelihood approximation
Main Sponsor
Section on Statistics in Epidemiology