Causal Inferences and Robust Estimators

Jingyu Cui Chair
Yale School of Public Health
 
Tuesday, Aug 5: 8:30 AM - 10:20 AM
4087 
Contributed Papers 
Music City Center 
Room: CC-103A 

Main Sponsor

Section on Statistics in Epidemiology

Presentations

Neyman-Orthogonal Changes-in-Changes Estimators for Causal Inference in Panel Data

We build on the influential changes‐in‐changes (CiC) framework of Athey and Imbens (2006) to estimate the average treatment effect on the treated (ATT) in panel‐data settings with unmeasured confounding. CiC has been a powerful alternative to difference‐in‐differences (DiD), as it relaxes the parallel‐trends requirement. At the same time, most existing implementations of CiC assume (i) a scalar unobserved confounder and (ii) a monotonic relation between that confounder and the outcome. To broaden the applicability of CiC, we make two key contributions. First, we show that the ATT remains nonparametrically identified under a set of novel conditions that allow for multivariate or non‐monotonic unmeasured confounders. Second, we propose a semiparametric estimator that is Neyman orthogonal with respect to infinite‐dimensional nuisance functions. This estimator can be applied with continuous measured covariates and modern machine‐learning tools while preserving valid inference. We illustrate our approach by studying how mass shootings influence voter behavior in U.S. presidential elections, a setting where voter sentiment is complex and only partly observed. 
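For orientation, the classic CiC estimator of Athey and Imbens (2006) that this abstract builds on can be sketched as follows. This is a minimal illustration of the original scalar, monotone, continuous-outcome case, not the Neyman-orthogonal estimator proposed in the talk. The function name `cic_att` and the four-sample input layout are assumptions made for the example. Each treated pre-period outcome is ranked in the control pre-period distribution and then mapped through the control post-period quantile function to obtain a counterfactual untreated outcome.

```python
import numpy as np

def cic_att(y00, y01, y10, y11):
    """Classic changes-in-changes ATT for continuous outcomes.

    y_gt holds outcomes for group g (0 = control, 1 = treated)
    in period t (0 = pre, 1 = post). Hypothetical helper for illustration.
    """
    y00, y01, y10, y11 = (
        np.sort(np.asarray(v, dtype=float)) for v in (y00, y01, y10, y11)
    )
    # Empirical CDF of control pre-period outcomes, evaluated at each
    # treated pre-period outcome.
    q = np.searchsorted(y00, y10, side="right") / y00.size
    # Generalized inverse of the control post-period CDF:
    # the smallest order statistic y with F_01(y) >= q.
    idx = np.clip(np.ceil(q * y01.size).astype(int) - 1, 0, y01.size - 1)
    counterfactual = y01[idx]
    # ATT = observed treated post-period mean minus counterfactual mean.
    return y11.mean() - counterfactual.mean()

# Simulated example (all distributions are assumptions): a common time
# shift of +1 and a treatment effect of +2 on the treated.
rng = np.random.default_rng(0)
n = 5000
att_hat = cic_att(
    rng.normal(0.0, 1.0, n),
    rng.normal(1.0, 1.0, n),
    rng.normal(0.5, 1.0, n),
    rng.normal(3.5, 1.0, n),
)
print(round(att_hat, 2))
```

With a pure location shift over time, the rank-and-map step reduces to adding the time shift, so the estimate should be close to the simulated effect.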

Keywords

Difference-in-Differences

Unmeasured Confounding

Semiparametric Theory

Policy Evaluation 

Co-Author

Eric Tchetgen Tchetgen, University of Pennsylvania

First Author

Jinghao Sun, University of Pennsylvania

Presenting Author

Jinghao Sun, University of Pennsylvania

Accounting for outcome spillover for causal inference with continuous spatiotemporal processes

Causal inference for processes that generate continuous spatiotemporal point process data is challenging. Current methods rely on discretizing the data and on assuming that points do not interact. We demonstrate that, in a highly general parametric setting, causal inference with observational spatiotemporal data is feasible in the presence of arbitrary outcome spillover. To do so, we construct a general framework for novel causal estimands of outcomes of interest using results from point process theory, prove the theoretical properties needed for rigorous hypothesis testing, and demonstrate practical estimability. Our proposed framework accommodates observational and experimental data, random and non-random treatment mechanisms, a general class of model specifications including those that allow for interaction between points, and state spaces ranging from subsets of $\mathbb{R}^d$ to linear networks. This work is pertinent to applications as diverse as epidemiology and finance, enabling previously impossible causal inference on rich continuous spatiotemporal data.

Keywords

Causal Inference

Spillover

Point Process

Hawkes Process

Epidemics

Interference 

Co-Author(s)

Duncan Clark
Martin Hazelton, University of Otago

First Author

Conor Kresin, UCLA

Presenting Author

Duncan Clark

Semiparametric Sieve Estimation for Survival Data with Two-layer Censoring

Disease registry data provide important information on the progression of disease conditions. However, reports of death or dropout of patients enrolled in the registry are always subject to a noticeable delay. Reporting delays, together with the administrative censoring that arises from a freeze date in data collection, lead to two layers of right censoring in the data. The first layer results from random dropout and acts on the survival time. The second layer is the administrative censoring, which acts on the sum of the reporting delay and the minimum of the survival time and the random dropout time. Heterogeneity among patients further complicates the analysis. This paper proposes a novel semiparametric sieve method based on phase-type distributions, in which covariates can be readily accommodated by the accelerated failure time model. A well-orchestrated EM algorithm is developed to compute the sieve maximum likelihood estimator. We establish the consistency and rate of convergence of the proposed sieve estimators, as well as the asymptotic normality and semiparametric efficiency of the estimators for the regression parameters. Comprehensive simulations and a real example of lung cancer registry data are used to demonstrate the proposed method. The results reveal substantial biases if reporting delays are overlooked.
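The two-layer censoring structure described in this abstract can be made concrete with a small simulation. This is only a sketch of the observation scheme, not the sieve estimator. The function name `simulate_two_layer`, the exponential and uniform distributions, and the status coding are all assumptions chosen for illustration.

```python
import numpy as np

def simulate_two_layer(n, seed=0):
    """Simulate registry records under two layers of right censoring.

    All distributions below are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    t = rng.exponential(5.0, n)    # survival time
    c = rng.exponential(10.0, n)   # random dropout time
    d = rng.exponential(0.5, n)    # reporting delay
    a = rng.uniform(2.0, 8.0, n)   # time from enrollment to the freeze date

    # Layer 1: random dropout acts on the survival time.
    x = np.minimum(t, c)
    # Layer 2: administrative censoring acts on x + d, because an event
    # is only in the data if its report arrives before the freeze date.
    reported = x + d <= a

    # Status: 1 = death reported, 2 = dropout reported,
    # 0 = nothing reported by the freeze date.
    status = np.where(reported, np.where(t <= c, 1, 2), 0)
    x_obs = np.where(reported, x, np.nan)
    return {"x_obs": x_obs, "status": status, "follow_up": a}

data = simulate_two_layer(1000)
print({int(k): int((data["status"] == k).sum()) for k in (0, 1, 2)})
```

For records with status 0, the analyst knows only that the event time plus the reporting delay exceeds the follow-up window, which is why treating the freeze date as ordinary censoring of the event time can be misleading.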

Keywords

Phase-type distribution

Reporting delay

Sieve estimator 

Co-Author(s)

Jie Hu, University of Pennsylvania
Tingyin Wang
Yong Chen, University of Pennsylvania, Perelman School of Medicine

First Author

Yudong Wang, University of Pennsylvania, Perelman School of Medicine

Presenting Author

Yudong Wang, University of Pennsylvania, Perelman School of Medicine

WITHDRAWN: Double Robust Variance Estimation with Parametric Working Models

Doubly robust estimators have gained popularity in the field of causal inference due to their ability to provide consistent point estimates when either an outcome or exposure model is correctly specified. However, for nonrandomized exposures, the influence-function-based variance estimator frequently used with doubly robust estimators of the average causal effect is only consistent when both working models (i.e., outcome and exposure models) are correctly specified. In this presentation, the empirical sandwich variance estimator and the nonparametric bootstrap are demonstrated to be doubly robust variance estimators. That is, they are expected to provide valid estimates of the variance, leading to nominal confidence interval coverage, when only one working model is correctly specified. Simulation studies illustrate the properties of the influence-function-based, empirical sandwich, and nonparametric bootstrap variance estimators in the setting where parametric working models are assumed. Estimators are applied to data from the Improving Pregnancy Outcomes with Progesterone (IPOP) study to estimate the effect of maternal anemia on birth weight among women with HIV.
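As a rough illustration of two of the variance estimators compared in this abstract, the sketch below computes an augmented inverse probability weighted (AIPW) estimate of the average causal effect with an influence-function-based standard error and a nonparametric bootstrap standard error that refits both working models in every resample. It is not the authors' code. The function names, the logistic exposure model, the arm-specific linear outcome models, and the simulated data are all assumptions, and the empirical sandwich estimator is omitted for brevity.

```python
import numpy as np

def fit_logistic(X, a, iters=25):
    """Logistic regression by Newton-Raphson (X includes an intercept column)."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        beta += np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (a - p))
    return beta

def aipw(X, a, y):
    """AIPW estimate of the average causal effect and its IF-based SE."""
    pi = 1.0 / (1.0 + np.exp(-X @ fit_logistic(X, a)))       # exposure model
    m1 = X @ np.linalg.lstsq(X[a == 1], y[a == 1], rcond=None)[0]  # outcome, exposed
    m0 = X @ np.linalg.lstsq(X[a == 0], y[a == 0], rcond=None)[0]  # outcome, unexposed
    phi = m1 - m0 + a * (y - m1) / pi - (1 - a) * (y - m0) / (1 - pi)
    return phi.mean(), phi.std(ddof=1) / np.sqrt(len(y))

def bootstrap_se(X, a, y, n_boot=200, seed=0):
    """Nonparametric bootstrap SE; both working models are refit each time."""
    rng = np.random.default_rng(seed)
    n = len(y)
    estimates = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)
        estimates.append(aipw(X[idx], a[idx], y[idx])[0])
    return np.std(estimates, ddof=1)

# Simulated data (an assumption): true average causal effect is 2.
rng = np.random.default_rng(1)
n = 2000
w = rng.normal(size=n)
X = np.column_stack([np.ones(n), w])
a = rng.binomial(1, 1.0 / (1.0 + np.exp(-0.5 * w)))
y = 1.0 + 2.0 * a + w + rng.normal(size=n)
ate, se_if = aipw(X, a, y)
se_boot = bootstrap_se(X, a, y, n_boot=100)
print(ate, se_if, se_boot)
```

In this simulation both working models are correctly specified, so the two standard errors should be similar. The abstract's claim concerns the case where one model is misspecified, which can be explored by dropping `w` from one of the models.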

Keywords

augmented inverse probability weighting

causal inference

double robustness

empirical sandwich variance

M-estimation 

Co-Author(s)

Paul Zivich
Chanhwa Lee
Keyi Xue, University of North Carolina at Chapel Hill
Rachael Ross, Department of Epidemiology, Mailman School of Public Health, Columbia University
Jessie Edwards, University of North Carolina-Chapel Hill
Jeffrey Stringer, Department of Obstetrics and Gynecology, University of North Carolina at Chapel Hill
Stephen Cole, University of North Carolina

First Author

Bonnie Shook-Sa, UNC Chapel Hill

Dual Role for Negative Control Outcomes: Improving Validity and Efficiency in Observational Studies

Negative control outcomes (NCOs) are increasingly used in observational studies to detect and correct bias, particularly in settings where unmeasured confounding poses a challenge to causal inference. While previous applications of NCOs have primarily focused on bias correction, their potential to improve the efficiency of treatment effect estimation remains underexplored. In this work, we propose a novel method that leverages NCOs not only to adjust for bias but also to enhance statistical efficiency. Through extensive simulations, we demonstrate that our approach can reduce the standard deviation of the estimated treatment effect by up to 60% while maintaining unbiased estimation. To illustrate its practical utility, we apply the method to evaluate the impact of GLP-1 receptor agonists (GLP1RAs) on mental health disorders. Our findings highlight the dual benefits of NCOs in improving both validity and precision in causal effect estimation.

Keywords

Efficiency Gain

GLP-1 receptor agonists

Negative Control Outcomes

Observational Studies 

Co-Author(s)

Huiyuan Wang, University of Pennsylvania
Dazheng Zhang
Yong Chen, University of Pennsylvania, Perelman School of Medicine

First Author

Yiwen Lu

Presenting Author

Yiwen Lu

Robust and efficient estimation of marginal structural models dependent on partial treatment history

Inverse probability (IP) weighting of marginal structural models (MSMs) can provide consistent estimators of time-varying treatment effects under correct model specification and identifiability assumptions, even in the presence of time-varying confounding. However, this method has two problems: (i) inefficiency, because the IP weights accumulate over all time points, and (ii) bias and inefficiency due to misspecification of the MSM. To address these problems, we propose new IP weights for estimating the parameters of an MSM that depends on partial treatment history, together with closed testing procedures for selecting the MSM under known IP weights. In simulation studies, our proposed methods outperformed existing methods both in estimating time-varying treatment effects and in selecting the correct MSM. We also applied the proposed methods to real data from hemodialysis patients and obtained reasonable results.
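The inefficiency noted in (i) can be seen from how IP weights are built. The sketch below contrasts the usual cumulative weights with weights restricted to a recent window of time points, in the spirit of history-restricted MSMs. It is not the weighting scheme proposed in the talk. The function name `ip_weights`, the `window` parameter, and the assumption that per-time-point probabilities of the observed treatment are already available are all hypothetical choices for illustration.

```python
import numpy as np

def ip_weights(p_num, p_den, window=None):
    """IP weights at each time point from per-time-point probabilities.

    p_num, p_den: (n, T) arrays holding the probability of the treatment
    actually received at each time under the numerator and denominator
    models. With window=None the weight at time t is the product of
    p_num / p_den over all k <= t; otherwise only the most recent
    `window` time points are multiplied.
    """
    ratio = np.asarray(p_num, dtype=float) / np.asarray(p_den, dtype=float)
    log_cum = np.cumsum(np.log(ratio), axis=1)
    if window is None:
        return np.exp(log_cum)
    if window < 1:
        raise ValueError("window must be at least 1")
    # Subtract the cumulative log-ratio up to time t - window.
    shifted = np.zeros_like(log_cum)
    shifted[:, window:] = log_cum[:, :-window]
    return np.exp(log_cum - shifted)

# One subject, three time points (made-up probabilities).
p_num = [[0.5, 0.5, 0.5]]
p_den = [[0.25, 0.5, 0.25]]
print(ip_weights(p_num, p_den))            # cumulative weights
print(ip_weights(p_num, p_den, window=1))  # most recent time point only
```

Because cumulative weights multiply one ratio per time point, their variability tends to grow with the length of follow-up, while windowed weights involve a fixed number of factors.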

Keywords

Closed testing procedure

History-restricted marginal structural models

Model selection / Variable selection

Inverse probability weighting

Time-varying confounding

Time-varying treatment 

Co-Author(s)

Masataka Taguri, Tokyo Medical University
Takeo Ishii, Yokohama City University

First Author

Nodoka Seya, Tokyo Medical University

Presenting Author

Nodoka Seya, Tokyo Medical University