Latent Confounder Adjustment for High-Dimensional Multivariate Binary Data
Xianming Tan
Co-Author
University of North Carolina at Chapel Hill
Di Hu
Speaker
University of North Carolina at Chapel Hill
Sunday, Aug 3: 5:05 PM - 5:25 PM
Topic-Contributed Paper Session
Music City Center
Vaccine safety surveillance relies on the analysis of high-dimensional binary outcomes, where multiple adverse events are recorded per report, as in the Vaccine Adverse Event Reporting System (VAERS). Accurate modeling of these outcomes is crucial for detecting potential vaccine safety signals. However, existing methods struggle to adjust for latent confounders while remaining computationally feasible in high-dimensional settings.
Some approaches, such as Principal Component Analysis (PCA) for binary data (De Leeuw, 2006) and Logistic PCA (Landgraf & Lee, 2020), extract latent factors but do not provide regression coefficients, so they cannot estimate direct associations between vaccine exposure and adverse events. Other methods are based on Generalized Linear Latent Variable Models (GLLVMs), such as Penalized Quasi-Likelihood (PQL) (Huber et al., 2004) and Alternating Iteratively Reweighted Least Squares (AIRWLS) (Kidzinski et al., 2022); these are designed for high-dimensional binary data and provide scalable estimation techniques. However, although they use latent factors to model dependence among responses, they do not explicitly adjust for latent confounders, which can bias estimates when unobserved factors influence both vaccine exposure and adverse events. In addition, the approximations used in PQL-based methods introduce estimation bias, making the estimated regression coefficients less reliable for assessing causal relationships.
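The confounding problem described above can be illustrated with a small simulation. The sketch below is not the proposed method: it only shows, under assumed and purely hypothetical parameter values, how omitting a latent factor that drives both exposure and a single binary outcome shifts the estimated exposure coefficient. The names `fit_logistic` and `simulate` are illustrative helpers, and the "oracle" fit uses the true latent factor, which is never observed in practice.

```python
import numpy as np


def fit_logistic(X, y, n_iter=25):
    """Logistic regression by Newton-Raphson; X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = p * (1.0 - p)
        grad = X.T @ (y - p)                 # score vector
        hess = X.T @ (X * w[:, None])        # observed information matrix
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def simulate(n=20000, true_beta=0.5, seed=0):
    """Return (naive, oracle) estimates of the exposure coefficient.

    All numeric settings here are assumptions chosen for illustration only.
    """
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                               # latent confounder
    x = rng.binomial(1, 1.0 / (1.0 + np.exp(-1.5 * z)))  # exposure depends on z
    eta = -1.0 + true_beta * x + 1.0 * z                 # outcome also depends on z
    y = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta)))

    ones = np.ones(n)
    # Naive fit: the latent factor is ignored.
    naive = fit_logistic(np.column_stack([ones, x]), y)[1]
    # Oracle fit: adjusts for the true latent factor (unavailable in real data).
    oracle = fit_logistic(np.column_stack([ones, x, z]), y)[1]
    return naive, oracle


if __name__ == "__main__":
    naive, oracle = simulate()
    print(f"true coefficient: 0.5, naive: {naive:.3f}, oracle: {oracle:.3f}")
```

In this toy setting the naive estimate is pulled away from the true value, while the oracle fit recovers it approximately. The gap between the two is the kind of bias that latent confounder adjustment aims to remove when the confounder has to be inferred from the outcomes themselves.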
To address these challenges, we propose a computationally efficient method for latent confounder adjustment in high-dimensional multivariate binary regression, specifically designed for vaccine safety applications. Our approach integrates latent confounder adjustment with coefficient estimation, overcoming the combinatorial complexity of binary outcomes while improving estimation accuracy. By leveraging efficient optimization strategies, we enable scalable inference that maintains statistical rigor in large real-world datasets.
Our ongoing work focuses on evaluating the theoretical properties and empirical performance of the proposed method using both simulated and real-world vaccine safety data. Preliminary results suggest that our approach has the potential to enhance vaccine safety surveillance by enabling robust association estimation in high-dimensional settings. Future directions include further scalability improvements, validation on additional real-world datasets, and potential extensions to longitudinal vaccine safety studies.