Tuesday, Aug 5: 2:00 PM - 3:50 PM
4130
Contributed Papers
Music City Center
Room: CC-106C
Main Sponsor
Section on Statistical Learning and Data Science
Presentations
Causal inference in observational studies with high-dimensional covariates presents significant challenges. We introduce CausalBGM, an AI-powered Bayesian generative modeling approach that captures the causal relationship among covariates, treatment, and outcome variables. The core innovation of CausalBGM lies in its ability to estimate the individual treatment effect (ITE) by learning individual-specific distributions of a low-dimensional latent feature set (e.g., latent confounders) that drives changes in both treatment and outcome. This approach not only effectively mitigates confounding effects but also provides comprehensive uncertainty quantification, offering reliable and interpretable causal effect estimates at the individual level. The framework leverages the power of AI to capture complex dependencies among variables while adhering to Bayesian principles. Its Bayesian foundation ensures statistical rigor, providing robust and well-calibrated posterior intervals. By addressing key limitations of existing methods, CausalBGM emerges as a robust and promising framework for advancing causal inference in modern applications.
Keywords
Treatment Effect
Bayesian Deep Learning
Markov chain Monte Carlo
Dose-response Function
Potential Outcome
Uncertainty Quantification
The Child Opportunity Index (COI) is a comprehensive measure that captures various dimensions of neighborhood environments, which may impact health outcomes. This study aims to explore the causal relationship between COI and glycemic control measured by A1C levels in individuals diagnosed with type 2 diabetes. Utilizing a robust dataset from patients at Akron Children's Hospital that includes zip code, COI, and various clinical and demographic information, we employ advanced causal inference methods to elucidate this association.
Our analysis focuses on a diverse cohort, examining how variations in COI influence A1C levels over the first year following diagnosis. Preliminary findings suggest that higher COI scores, indicative of more favorable neighborhood conditions, are associated with better A1C outcomes.
This study underscores the importance of considering social determinants of health in managing type 2 diabetes and highlights the potential of COI as a valuable tool for identifying at-risk populations.
Keywords
Causal Inference
Health Disparities
Social Determinant of Health
Public Data
Child Opportunity Index
Type 2 Diabetes
A/B testing is the gold standard for measuring causal relationships, but it can be operationally infeasible in certain business scenarios. As an alternative, causal inference plays a critical role at Google Cloud in quantifying the business impact of new releases or launches. In practice, the success of an attribution analysis depends on accurately identifying the key variables and representing them in the most suitable forms, which often results in a mix of variable types. In this talk, we consider the case where continuous variables are present: specifically, where the covariates include continuous variables and the response of interest is continuous. We will discuss the main challenges posed by heavily skewed continuous variables, and a simple solution for incorporating them into a doubly robust causal framework widely adopted within Google Cloud.
Keywords
Causal inference
Doubly robust
Machine learning
Continuous variables
Skewed variables
Covariate balance
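As a rough illustration of the doubly robust idea mentioned in the abstract above, the sketch below computes an augmented inverse probability weighting (AIPW) estimate of an average treatment effect on simulated data with one heavily skewed covariate. Everything here is an assumption made for illustration: the simulated data, the function names `fit_logistic` and `aipw_ate`, and in particular the log transform of the skewed covariate, which is one common way to handle skew and is not stated in the abstract to be the speakers' solution.

```python
import numpy as np

def fit_logistic(X, t, n_iter=50, ridge=1e-6):
    """Fit a logistic regression by Newton's method (X includes an intercept column)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        grad = X.T @ (t - p) - ridge * beta
        hess = (X * w[:, None]).T @ X + ridge * np.eye(X.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def aipw_ate(x, t, y, clip=0.01):
    """AIPW estimate of the average treatment effect for a binary treatment t."""
    Xd = np.column_stack([np.ones(len(x)), x])
    # Propensity model: P(T = 1 | X), clipped away from 0 and 1 for stability.
    e = 1.0 / (1.0 + np.exp(-Xd @ fit_logistic(Xd, t)))
    e = np.clip(e, clip, 1.0 - clip)
    # Outcome models: separate least-squares fits for treated and control units.
    b1 = np.linalg.lstsq(Xd[t == 1], y[t == 1], rcond=None)[0]
    b0 = np.linalg.lstsq(Xd[t == 0], y[t == 0], rcond=None)[0]
    m1, m0 = Xd @ b1, Xd @ b0
    # Outcome-model contrast plus inverse-probability-weighted residual corrections.
    psi = m1 - m0 + t * (y - m1) / e - (1.0 - t) * (y - m0) / (1.0 - e)
    return psi.mean()

# Simulated example (hypothetical data, true effect = 2.0).
rng = np.random.default_rng(0)
n = 5000
raw = rng.lognormal(mean=0.0, sigma=1.0, size=n)  # heavily right-skewed covariate
x = np.log(raw)                                   # assumed fix: log transform
p_treat = 1.0 / (1.0 + np.exp(-0.8 * x))
t = (rng.random(n) < p_treat).astype(float)
y = 2.0 * t + 1.5 * x + rng.normal(size=n)

naive = y[t == 1].mean() - y[t == 0].mean()       # confounded difference in means
ate_hat = aipw_ate(x, t, y)
print(f"naive: {naive:.2f}, AIPW: {ate_hat:.2f}")
```

The estimator is called doubly robust because it remains consistent if either the propensity model or the outcome model is correctly specified. In this simulation both are, so the example only shows the mechanics and the gap relative to the naive comparison.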
Data from multiple environments offer valuable opportunities to uncover causal relationships among variables. We propose nearly necessary and sufficient conditions for ensuring that the invariant prediction model matches the causal outcome model. Exploiting the essentially necessary identification conditions, we introduce Negative Weight Distributionally Robust Optimization (NegDRO), a nonconvex continuous minimax optimization whose global optimizer recovers the causal outcome model. Unlike standard group DRO problems that maximize over the simplex, NegDRO allows negative weights on environment losses, which break the convexity. Despite its nonconvexity, we demonstrate that a standard gradient method converges to the causal outcome model, and we establish the convergence rate with respect to the sample size and the number of iterations. Unlike the existing causal invariance learning approaches, our algorithm avoids exhaustive search, making it scalable especially when the number of covariates is large.
Keywords
Causal invariance learning
Nonconvex optimization
Computationally efficient causal discovery
Multi-source data
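To make the contrast with standard group DRO in the abstract above concrete, the sketch below compares the inner maximization over environment weights in two cases: weights restricted to the simplex, and weights that sum to one but may go as low as a negative bound. The weight set {w : sum(w) = 1, w_e >= -gamma} and the names `simplex_worst_case` and `negative_weight_worst_case` are assumptions for illustration; the abstract does not spell out the exact constraint set used by NegDRO.

```python
import numpy as np

def simplex_worst_case(losses):
    """Standard group DRO inner problem: max of w . losses over the simplex."""
    return float(np.max(losses))

def negative_weight_worst_case(losses, gamma):
    """Max of w . losses over {w : sum(w) = 1, w_e >= -gamma} (assumed weight set).

    The maximum of a linear function over this set is attained at a vertex:
    weight -gamma on every environment except the one with the largest loss.
    """
    losses = np.asarray(losses, dtype=float)
    top = losses.max()
    return float(top + gamma * np.sum(top - losses))

env_losses = [1.0, 2.0, 4.0]  # hypothetical per-environment losses
print(simplex_worst_case(env_losses))               # 4.0
print(negative_weight_worst_case(env_losses, 0.5))  # 6.5
```

Under this assumed weight set, the negative-weight objective adds a penalty proportional to how far each environment's loss falls below the worst one, so it rewards models whose losses are equal across environments. With gamma = 0 it reduces to the simplex case.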
We consider the problem of learning the underlying causal structure among a set of variables, which are assumed to follow a Bayesian network or, more specifically, a linear acyclic structural equation model (SEM) with the associated errors being independent and allowed to be non-Gaussian. A Bayesian hierarchical model is proposed to identify the true data-generating directed acyclic graph (DAG) structure, where the nodes and edges represent the variables and the direct causal effects, respectively. Moreover, incorporating the information of non-Gaussian errors, we characterize the distribution equivalence class of the true DAG, which specifies the best possible extent to which the DAG can be identified based on purely observational data. Furthermore, assuming that the errors are distributed as a scale mixture of Gaussians with an unspecified mixing distribution, and under mild distributional assumptions, we establish that the posterior probability of the distribution equivalence class of the true DAG converges to unity as the sample size grows. This shows that the proposed method achieves posterior DAG selection consistency. Simulation studies are presented to illustrate the results, where we also demonstrate different rates of divergence of the associated posterior odds across the competing DAGs.
Keywords
Causal discovery
Causal inference
Structural equation modeling
Bayesian model selection
Posterior consistency
Causal structure learning
The discovery of causal relationships from observational data is very challenging. Many recent approaches rely on complexity or uncertainty concepts to impose constraints on probability distributions, aiming to identify specific classes of directed acyclic graph (DAG) models. In this paper, we introduce a novel identifiability criterion for DAGs that places constraints on the conditional variances of additive noise models. We demonstrate that this criterion extends and generalizes existing identifiability criteria in the literature that employ (conditional) variances as measures of uncertainty in (conditional) distributions. For linear structural equation models, we present a new algorithm that leverages the concept of weak majorization applied to the diagonal elements of the Cholesky factor of the covariance matrix to learn a topological ordering of variables. Through extensive simulations and the analysis of bank connectivity data, we provide evidence of the effectiveness of our approach in successfully recovering DAGs. The code for reproducing the results in this paper is available in Supplementary Materials.
Keywords
Directed Acyclic Graphs
Identifiability
Majorization
Causal structure learning
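The abstract above relates conditional variances and the Cholesky factor of the covariance matrix to a topological ordering. The sketch below illustrates that link in the simpler equal-error-variance setting, one of the existing variance-based criteria that the abstract says its criterion generalizes. It is not the authors' weak-majorization algorithm. The three-variable model, its edge weights, and the name `greedy_order` are hypothetical.

```python
import numpy as np

def greedy_order(cov):
    """Top-down ordering: repeatedly pick the variable with the smallest
    conditional variance given the variables already placed."""
    order, remaining = [], list(range(cov.shape[0]))
    while remaining:
        cond_var = []
        for j in remaining:
            v = cov[j, j]
            if order:
                S = cov[np.ix_(order, order)]
                c = cov[np.ix_(order, [j])]
                v -= (c.T @ np.linalg.solve(S, c)).item()
            cond_var.append(v)
        nxt = remaining[int(np.argmin(cond_var))]
        order.append(nxt)
        remaining.remove(nxt)
    return order

# Hypothetical linear SEM with unit error variances: X2 -> X0 -> X1.
B = np.zeros((3, 3))
B[2, 0] = 0.8    # X0 = 0.8 * X2 + e0
B[0, 1] = -1.2   # X1 = -1.2 * X0 + e1
A = np.linalg.inv(np.eye(3) - B.T)  # X = A e
cov = A @ A.T                       # population covariance for Cov(e) = I

order = greedy_order(cov)
print(order)  # [2, 0, 1]

# With variables arranged in a topological order, the squared diagonal of the
# Cholesky factor gives each variable's variance conditional on its
# predecessors, which here equals its error variance.
chol_diag = np.diag(np.linalg.cholesky(cov[np.ix_(order, order)]))
print(np.round(chol_diag ** 2, 6))
```

In practice the population covariance would be replaced by a sample estimate, and the argmin step becomes sensitive to estimation error when conditional variances are close.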
Recent years have seen significant methodological advancements in predicting heterogeneous treatment effects (HTE). However, there is a scarcity of methodological approaches for HTE arising from random effects. Our work addresses the challenges in estimating HTE stemming from random effects. We particularly focus on developing a methodology for estimating HTE when the design matrix forming the random effects is unidentified, a scenario frequently encountered in many practical fields. When cluster distributions are separated by covariates, we demonstrate that random effects can be estimated through tree nodes. We provide theoretical proofs of consistency. The model is validated using both simulations and real-world data, and is compared with causal forests. This approach expands the applicability of tree algorithms and enhances the role of random effects in HTE estimation.
Keywords
Heterogeneous treatment effect
Random effect
Classification and regression trees
Model misspecification