Recent Advances of How to Incorporate Auxiliary Data: External Controls and Beyond

Ima Placeholder Chair
ASA-Placeholder Record
 
Jiwei Zhao Organizer
University of Wisconsin-Madison
 
Thursday, Aug 8: 8:30 AM - 10:20 AM
1427 
Invited Paper Session 
Oregon Convention Center 
Room: CC-257 

Abstracts


Applied

Yes

Main Sponsor

ENAR

Co Sponsors

Biometrics Section
Biopharmaceutical Section

Presentations

Doubly Safe Estimation for the Average Treatment Effect on the Treated with External Control Data under High-Dimensionality

The randomized controlled trial (RCT) has long been a gold standard for causal discovery in biomedical studies. In this paper, we consider the situation in which some external control data, possibly with a much larger sample size, are available. However, the standard doubly robust estimator of the average treatment effect on the treated (ATT) that incorporates external controls may be even less efficient than the naive doubly robust estimator that does not use them. This is not ideal, because it means that incorporating external controls may harm estimation. To fix this issue, we propose a novel doubly robust estimator that is guaranteed to be more efficient than the naive doubly robust estimator without external controls. Further, if all models are correct, the proposed estimator coincides with the standard doubly robust estimator incorporating external controls, and it is semiparametrically efficient. The asymptotic theory developed in this paper, covering both estimation and statistical inference, holds in the general setting of high-dimensional confounders. We conduct comprehensive simulation studies, as well as a real data application, to illustrate the proposed methodology.
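For orientation, the benchmark mentioned in the abstract, a doubly robust estimator of the average treatment effect on the treated (ATT) that uses trial data only, can be sketched as below. This is the textbook augmented inverse-probability-weighted form, not the estimator proposed in the talk. The function name and arguments are illustrative assumptions, and the fitted propensity scores and control-outcome predictions are assumed to be supplied by the user.

```python
def att_doubly_robust(y, a, e_hat, m0_hat):
    """Textbook doubly robust ATT estimate (no external controls).

    y      : observed outcomes
    a      : treatment indicators (1 = treated, 0 = control)
    e_hat  : fitted propensity scores P(A = 1 | X), assumed to lie in (0, 1)
    m0_hat : fitted control-outcome predictions E[Y | A = 0, X]
    All four inputs are hypothetical, user-supplied sequences of equal length.
    """
    n_treated = sum(a)
    if n_treated == 0:
        raise ValueError("at least one treated subject is required")

    total = 0.0
    for yi, ai, ei, mi in zip(y, a, e_hat, m0_hat):
        residual = yi - mi
        if ai == 1:
            # Treated subjects contribute their outcome-model residual.
            total += residual
        else:
            # Controls are reweighted by the propensity odds e / (1 - e).
            total -= ei / (1.0 - ei) * residual
    return total / n_treated
```

The estimate is consistent if either the propensity model or the control-outcome model is correctly specified, which is the sense in which it is doubly robust.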

Speaker

Jiwei Zhao, University of Wisconsin-Madison

Leveraging External Data to Augment a Single Arm Study or the Control Arm of a Randomized Clinical Trial

External data have mainly been incorporated into clinical trials through matching methods on baseline characteristics, either to establish an external control arm or to augment a single-arm study or the control arm of a randomized controlled trial (RCT). However, matching the baseline characteristics of the trial subjects and the external subjects can only guarantee that the external subjects leveraged are comparable on those characteristics. Differences between the two data sources may still exist due to contemporaneous and operational characteristics that are not captured at baseline. Such differences are usually reflected in the outcome data rather than in the subjects' baseline characteristics. In this talk, we first present novel propensity-score integrated methods for augmenting a single-arm study, and then extend them to augmenting the control arm of an RCT. The methods are explained using illustrative examples.

Speaker

Ram Tiwari, Bristol Myers Squibb

WITHDRAWN Distributed Empirical Likelihood Inference with Massive Data

Empirical likelihood is an important nonparametric approach with wide application. However, it is hard, and sometimes infeasible, to calculate the empirical log-likelihood ratio statistic with massive data. The main challenge is the calculation of the Lagrange multiplier. This motivates us to develop a distributed empirical likelihood method that calculates the Lagrange multiplier in a multi-round distributed manner. It is shown that the distributed empirical log-likelihood ratio statistic is asymptotically standard chi-squared under some mild conditions. The proposed algorithm is communication-efficient and achieves the desired accuracy in a few rounds. Further, the distributed empirical likelihood method is extended to the case of Byzantine failures. A machine selection algorithm is developed to identify the worker machines without Byzantine failures so that the distributed empirical likelihood method can be applied. The proposed methods are evaluated by numerical simulations and illustrated with an analysis of airline on-time performance and a surface climate analysis of the Yangtze River Economic Belt.
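To make the role of the Lagrange multiplier concrete, the following is a minimal single-machine sketch for the simplest case, empirical likelihood for a scalar mean. It is not the distributed, multi-round algorithm of the talk. The function names, the damped Newton iteration, and the tolerances are all illustrative assumptions.

```python
import math


def el_lagrange_multiplier(x, mu, tol=1e-12, max_iter=100):
    """Solve sum(z_i / (1 + lam * z_i)) = 0 with z_i = x_i - mu by damped Newton."""
    z = [xi - mu for xi in x]
    if not (min(z) < 0.0 < max(z)):
        raise ValueError("mu must lie strictly inside the range of the data")

    def objective(lam):
        # Concave dual objective; its maximizer is the Lagrange multiplier.
        return sum(math.log(1.0 + lam * zi) for zi in z)

    lam = 0.0
    for _ in range(max_iter):
        grad = sum(zi / (1.0 + lam * zi) for zi in z)
        hess = -sum((zi / (1.0 + lam * zi)) ** 2 for zi in z)
        step = -grad / hess
        current = objective(lam)
        t = 1.0
        for _ in range(60):
            cand = lam + t * step
            # Accept only steps that keep all implied weights positive
            # and do not decrease the objective (up to rounding slack).
            if all(1.0 + cand * zi > 0.0 for zi in z) and objective(cand) >= current - 1e-12:
                break
            t *= 0.5
        else:
            cand = lam  # no acceptable step found; stop at the current value
        converged = abs(cand - lam) < tol
        lam = cand
        if converged:
            break
    return lam


def el_log_ratio_statistic(x, mu):
    """Empirical log-likelihood ratio statistic, -2 log R(mu), for a scalar mean."""
    lam = el_lagrange_multiplier(x, mu)
    return 2.0 * sum(math.log(1.0 + lam * (xi - mu)) for xi in x)
```

Every Newton step here needs sums over the full sample, which illustrates why this calculation becomes the bottleneck when the data are spread across many machines.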

Speaker

Qihua Wang, Academy of Mathematics and Systems Science, Chinese Academy of Sciences

Sensitivity Analysis Framework for Unmeasured Confounding when Integrating External Controls in Randomized Controlled Trials

Integrating external control subjects with randomized controlled trials (RCTs) can increase the efficiency of causal treatment effect estimation and can overcome certain limitations of traditional RCTs, but it may also introduce bias and inflate the type I error rate because of heterogeneity between the data sources. To mitigate bias and better control type I error rates, dynamic borrowing approaches have been proposed that use the similarity in baseline or outcome variables between concurrent and external controls to determine the degree of borrowing. These approaches have been shown to give better bias control and can yield greater power when comparison functions are correctly specified. However, misspecification due to unmeasured confounding may lead to underestimating the degree of data heterogeneity and to over-borrowing. To evaluate the robustness of statistical inference using external control data, we propose a fully Bayesian sensitivity analysis framework that assesses the potential impact of unmeasured confounding under the Bayesian power prior with subject-specific weights.
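As a point of reference for the power prior with subject-specific weights, here is a minimal conjugate sketch. It assumes a normal outcome with known variance and a flat initial prior on the control mean, and it assumes the weights are given rather than estimated. It does not implement the sensitivity analysis framework of the talk, and the function name and arguments are illustrative.

```python
def power_prior_posterior(y_current, y_external, weights, sigma2=1.0):
    """Posterior mean and variance of a normal mean under a weighted power prior.

    Each external subject i enters the likelihood raised to the power
    weights[i] in [0, 1]; 0 discards the subject and 1 pools it fully.
    Assumes a known outcome variance sigma2 and a flat initial prior.
    """
    if len(y_external) != len(weights):
        raise ValueError("one weight is required per external subject")
    if any(w < 0.0 or w > 1.0 for w in weights):
        raise ValueError("weights must lie in [0, 1]")

    # Discounted external subjects count as fractional observations.
    effective_n = len(y_current) + sum(weights)
    if effective_n == 0:
        raise ValueError("no information: empty data and zero weights")

    weighted_sum = sum(y_current) + sum(w * y for w, y in zip(weights, y_external))
    return weighted_sum / effective_n, sigma2 / effective_n
```

With concurrent controls [1, 2, 3] and external controls [5, 5], weights of 0 return the concurrent-control mean 2.0, weights of 1 return the pooled mean 3.2, and intermediate weights fall in between. This shows how over-borrowing pulls the estimate toward the external data when the two sources differ.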

Speaker

Mingyang Shan, Eli Lilly and Company

Mitigating Bias in Treatment Effect Estimation: Strategies for Utilizing External Controls in Randomized Trials

In recent years, real-world external controls (ECs) have gained popularity as a means of enhancing the efficacy of randomized controlled trials (RCTs), particularly in rare diseases or in situations where equitable randomization is unfeasible or unethical. However, the suitability of ECs varies relative to RCT data, so caution is needed before using ECs to avoid introducing substantial bias into treatment effect estimation. A central challenge is that outcomes may differ between concurrent controls (CCs) and ECs even after accounting for covariate disparities, often because of latent confounding variables. This talk covers a range of methodologies designed to mitigate the unknown biases associated with ECs. These methodologies include pre-testing, bias function modeling, and selective borrowing, all framed within semiparametric models. Together, the proposed strategies form a toolkit for practitioners aiming to incorporate ECs effectively and offer a comprehensive framework for their integration.
 

Speaker

Shu Yang, North Carolina State University, Department of Statistics