Methods for Causal Discovery and Causal Inference

Salvador Balkus Chair
Harvard T.H. Chan School of Public Health
 
Tuesday, Aug 5: 2:00 PM - 3:50 PM
4130 
Contributed Papers 
Music City Center 
Room: CC-106C 

Main Sponsor

Section on Statistical Learning and Data Science

Presentations

An AI-powered Bayesian generative modeling approach for causal inference in observational studies

Causal inference in observational studies with high-dimensional covariates presents significant challenges. We introduce CausalBGM, an AI-powered Bayesian generative modeling approach that captures the causal relationships among covariates, treatment, and outcome variables. The core innovation of CausalBGM lies in its ability to estimate the individual treatment effect (ITE) by learning individual-specific distributions of a low-dimensional latent feature set (e.g., latent confounders) that drives changes in both treatment and outcome. This approach not only effectively mitigates confounding effects but also provides comprehensive uncertainty quantification, offering reliable and interpretable causal effect estimates at the individual level. This framework leverages the power of AI to capture complex dependencies among variables while adhering to Bayesian principles. Its Bayesian foundation ensures statistical rigor, providing robust and well-calibrated posterior intervals. By addressing key limitations of existing methods, CausalBGM emerges as a robust and promising framework for advancing causal inference in modern applications.

Keywords

Treatment Effect

Bayesian Deep Learning

Markov chain Monte Carlo

Dose-response Function

Potential Outcome

Uncertainty Quantification 

Co-Author

Wing-Hung Wong, Stanford University

First Author

Qiao Liu, Stanford University

Presenting Author

Qiao Liu, Stanford University

Causal Association between Child Opportunity Index (COI) and A1C levels in Type II Diabetes Patients

The Child Opportunity Index (COI) is a comprehensive measure that captures various dimensions of neighborhood environments, which may impact health outcomes. This study aims to explore the causal relationship between COI and glycemic control measured by A1C levels in individuals diagnosed with type 2 diabetes. Utilizing a robust dataset from patients at Akron Children's Hospital that includes zip code, COI, and various clinical and demographic information, we employ advanced causal inference methods to elucidate this association.
Our analysis focuses on a diverse cohort, examining how variations in COI influence A1C levels over the first year following diagnosis. Preliminary findings suggest that higher COI scores, indicative of more favorable neighborhood conditions, are associated with better A1C outcomes.
This study underscores the importance of considering social determinants of health in managing type 2 diabetes and highlights the potential of COI as a valuable tool for identifying at-risk populations. 

Keywords

Causal Inference

Health Disparities

Social Determinant of Health

Public Data

Child Opportunity Index

Type 2 Diabetes

Co-Author(s)

Lisa Shauver, Akron Children's Hospital
Kevin Stoll, Welltower
Michael Forbes, Akron Children's Hospital
Michael Oravec, Akron Children's Hospital
Jonathan Pelletier, Akron Children's Hospital
Ryan Heksch, Akron Children's Hospital

First Author

Sima Sharghi, Akron Children's Hospital

Presenting Author

Sima Sharghi, Akron Children's Hospital

Causal inference with heavily skewed continuous variables in Google Cloud

A/B testing is the gold standard for measuring causal relationships, but it can be operationally infeasible in certain business scenarios. As an alternative, observational causal inference is used extensively in Google Cloud to quantify the business impact of new releases or launches. In practice, the success of an attribution analysis depends on accurately identifying the key variables and representing them in the most suitable forms, which often results in a mix of variable types. In this talk, we consider the case where continuous variables are present: specifically, where the covariates include continuous variables and the response of interest is continuous. We will discuss the main challenges posed by heavily skewed continuous variables and a simple solution for incorporating them into a doubly robust causal framework widely adopted within Google Cloud.
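To make the doubly robust idea concrete, the sketch below computes an augmented inverse-probability-weighted (AIPW) estimate of an average treatment effect on simulated data with a heavily skewed covariate. It is only an illustration under stated assumptions: the abstract does not describe the talk's actual solution, so the `log1p` transform, the helper names `aipw_ate` and `fit_linear`, the simulated data, and the use of the known propensity score (which would be estimated in practice) are all hypothetical choices.

```python
import numpy as np

def aipw_ate(y, t, mu1, mu0, e):
    """AIPW (doubly robust) estimate of the average treatment effect.

    mu1, mu0: outcome-model predictions under treatment and control.
    e: propensity scores P(T = 1 | X).
    """
    y, t, mu1, mu0, e = map(np.asarray, (y, t, mu1, mu0, e))
    psi = mu1 - mu0 + t * (y - mu1) / e - (1 - t) * (y - mu0) / (1 - e)
    return psi.mean()

def fit_linear(x, y):
    """Least-squares fit of y on [1, x]; returns a prediction function."""
    design = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return lambda x_new: np.column_stack([np.ones(len(x_new)), x_new]) @ coef

rng = np.random.default_rng(0)
n = 20_000

# Hypothetical heavily skewed covariate (log-normal), e.g. a usage metric.
raw = rng.lognormal(mean=0.0, sigma=1.5, size=n)
x = np.log1p(raw)  # illustrative transform, assumed here, not taken from the talk

# Simulated treatment and outcome; the true effect is 2.0 by construction.
propensity = 1.0 / (1.0 + np.exp(-(x - 1.0)))
t = rng.binomial(1, propensity)
y = 2.0 * t + 1.5 * x + rng.normal(size=n)

# Separate outcome models for treated and control units.
mu1 = fit_linear(x[t == 1], y[t == 1])(x)
mu0 = fit_linear(x[t == 0], y[t == 0])(x)

ate = aipw_ate(y, t, mu1, mu0, propensity)
print(f"AIPW estimate of the ATE: {ate:.3f}")
```

The estimator stays consistent if either the outcome models or the propensity scores are correctly specified, which is the sense in which it is doubly robust.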

Keywords

Causal inference

Doubly robust

Machine learning

Continuous variables

Skewed variables

Covariate balance 

Co-Author

Tianhong He

First Author

Xueqi Zhao

Presenting Author

Xueqi Zhao

Causal Invariance Learning via Efficient Optimization of a Nonconvex Objective

Data from multiple environments offer valuable opportunities to uncover causal relationships among variables. We propose nearly necessary and sufficient conditions for ensuring that the invariant prediction model matches the causal outcome model. Exploiting the essentially necessary identification conditions, we introduce Negative Weight Distributionally Robust Optimization (NegDRO), a nonconvex continuous minimax optimization whose global optimizer recovers the causal outcome model. Unlike standard group DRO problems that maximize over the simplex, NegDRO allows negative weights on environment losses, which breaks convexity. Despite its nonconvexity, we demonstrate that a standard gradient method converges to the causal outcome model, and we establish the convergence rate with respect to the sample size and the number of iterations. Unlike existing causal invariance learning approaches, our algorithm avoids exhaustive search, making it scalable, especially when the number of covariates is large.
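The contrast with group DRO can be written out schematically. The display below is a sketch only: the per-environment loss $L_e(b)$, the weight set $\mathcal{U}(\gamma)$, and the parameter $\gamma$ are notation assumed here for illustration, since the abstract states only that the weights may be negative and does not give the exact constraint set.

```latex
% Standard group DRO: weights range over the probability simplex.
\min_{b} \; \max_{w \in \Delta} \; \sum_{e=1}^{E} w_e \, L_e(b),
\qquad
\Delta = \Bigl\{ w \in \mathbb{R}^{E} : w_e \ge 0, \; \sum_{e=1}^{E} w_e = 1 \Bigr\}.

% Negative-weight variant (assumed form): weights may fall below zero.
\min_{b} \; \max_{w \in \mathcal{U}(\gamma)} \; \sum_{e=1}^{E} w_e \, L_e(b),
\qquad
\mathcal{U}(\gamma) = \Bigl\{ w \in \mathbb{R}^{E} : w_e \ge -\gamma, \; \sum_{e=1}^{E} w_e = 1 \Bigr\},
\quad \gamma > 0.
```

When every $L_e$ is convex in $b$, any term with $w_e < 0$ is concave in $b$, so the weighted sum is no longer guaranteed to be convex once negative weights are admitted.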

Keywords

Causal invariance learning

Nonconvex optimization

Computationally efficient causal discovery

Multi-source data 

Co-Author(s)

Yifan Hu, College of Management of Technology, EPFL
Peter Bühlmann, ETH Zurich
Zijian Guo, Rutgers University

First Author

Zhenyu Wang, Rutgers University

Presenting Author

Zhenyu Wang, Rutgers University

Consistent DAG selection for Bayesian Causal Discovery under general error distributions

We consider the problem of learning the underlying causal structure among a set of variables, which are assumed to follow a Bayesian network or, more specifically, a linear acyclic structural equation model (SEM) whose errors are independent and allowed to be non-Gaussian. A Bayesian hierarchical model is proposed to identify the true data-generating directed acyclic graph (DAG) structure, where the nodes and edges represent the variables and the direct causal effects, respectively. Moreover, incorporating the information of non-Gaussian errors, we characterize the distribution equivalence class of the true DAG, which specifies the best possible extent to which the DAG can be identified from purely observational data. Furthermore, assuming that the errors follow a scale mixture of Gaussians with an unspecified mixing distribution, and under mild distributional assumptions, we establish that the posterior probability of the distribution equivalence class of the true DAG converges to unity as the sample size grows. This shows that the proposed method achieves posterior DAG selection consistency. Simulation studies are presented to illustrate the results, where we also demonstrate different rates of divergence of the associated posterior odds across the competing DAGs.
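The data-generating model in this abstract can be illustrated with a short simulation. The sketch below draws from a linear acyclic SEM whose errors are a scale mixture of Gaussians; it covers only the model, not the proposed Bayesian procedure. The function name `simulate_linear_sem`, the three-node chain, and the choice of an exponential mixing distribution (which yields Laplace errors) are assumptions made for the example, whereas the abstract leaves the mixing distribution unspecified.

```python
import numpy as np

def simulate_linear_sem(B, n, rng):
    """Draw n samples from X = B X + eps, where B[j, k] is the effect of k on j.

    Each error is Gaussian given its own random variance (a scale mixture of
    Gaussians). With variance = 2 * Exp(1), the error is standard Laplace.
    """
    p = B.shape[0]
    mixing_var = 2.0 * rng.exponential(scale=1.0, size=(n, p))
    eps = np.sqrt(mixing_var) * rng.standard_normal((n, p))
    # Solve x = (I - B)^{-1} eps for every row at once.
    return eps @ np.linalg.inv(np.eye(p) - B).T

# Hypothetical chain DAG: X1 -> X2 -> X3.
B = np.array([
    [0.0,  0.0, 0.0],
    [0.8,  0.0, 0.0],
    [0.0, -0.6, 0.0],
])

rng = np.random.default_rng(1)
X = simulate_linear_sem(B, n=50_000, rng=rng)

# The root node equals its own error term, so its variance should be near 2.
print("sample variance of X1:", X[:, 0].var())
```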

Keywords

Causal discovery

Causal inference

Structural equation modeling

Bayesian model selection

Posterior consistency

Causal structure learning 

Co-Author(s)

Anirban Bhattacharya, Texas A&M University
Yang Ni, Texas A&M University

First Author

Anamitra Chaudhuri, Department of Statistics, Texas A&M University

Presenting Author

Anamitra Chaudhuri, Department of Statistics, Texas A&M University

Generalized Criterion for Identifiability of Additive Noise Models Using Majorization

The discovery of causal relationships from observational data is very challenging. Many recent approaches rely on complexity or uncertainty concepts to impose constraints on probability distributions, aiming to identify specific classes of directed acyclic graph (DAG) models. In this paper, we introduce a novel identifiability criterion for DAGs that places constraints on the conditional variances of additive noise models. We demonstrate that this criterion extends and generalizes existing identifiability criteria in the literature that employ (conditional) variances as measures of uncertainty in (conditional) distributions. For linear structural equation models, we present a new algorithm that leverages the concept of weak majorization applied to the diagonal elements of the Cholesky factor of the covariance matrix to learn a topological ordering of variables. Through extensive simulations and the analysis of bank connectivity data, we provide evidence of the effectiveness of our approach in successfully recovering DAGs. The code for reproducing the results in this paper is available in Supplementary Materials. 
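One fact behind the Cholesky-based approach is easy to check numerically: for a linear SEM whose variables are listed in a topological order, the squared diagonal of the Cholesky factor of the covariance matrix equals the noise variances. The sketch below verifies this and includes a helper for a weak majorization comparison. It is not the paper's algorithm; the names `sem_covariance` and `weakly_majorizes`, the example coefficients, and the particular convention used (weak submajorization, comparing partial sums of entries sorted in decreasing order) are assumptions for illustration.

```python
import numpy as np

def sem_covariance(B, noise_var):
    """Covariance of X = B X + eps with strictly lower-triangular B."""
    p = B.shape[0]
    A = np.linalg.inv(np.eye(p) - B)
    return A @ np.diag(noise_var) @ A.T

def weakly_majorizes(a, b, tol=1e-12):
    """True if a weakly (sub)majorizes b.

    Both vectors are sorted in decreasing order and every partial sum of a
    must be at least the matching partial sum of b.
    """
    a_sorted = np.sort(a)[::-1]
    b_sorted = np.sort(b)[::-1]
    return bool(np.all(np.cumsum(a_sorted) >= np.cumsum(b_sorted) - tol))

# Hypothetical 3-variable SEM, already in topological order.
B = np.array([
    [0.0,  0.0, 0.0],
    [0.8,  0.0, 0.0],
    [-0.5, 1.2, 0.0],
])
noise_var = np.array([1.0, 0.5, 2.0])
sigma = sem_covariance(B, noise_var)

# Cholesky diagonal under the true ordering and under the reversed ordering.
diag_true = np.diag(np.linalg.cholesky(sigma))
perm = [2, 1, 0]
diag_reversed = np.diag(np.linalg.cholesky(sigma[np.ix_(perm, perm)]))

print("squared diagonal, true order:    ", diag_true ** 2)
print("squared diagonal, reversed order:", diag_reversed ** 2)
print("reversed weakly majorizes true:  ",
      weakly_majorizes(diag_reversed ** 2, diag_true ** 2))
```

Under any ordering the squared diagonal entries are conditional variances given the preceding variables, and their product equals the determinant of the covariance matrix, so orderings differ only in how that total is spread across the entries.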

Keywords

Directed Acyclic Graphs

Identifiability

Majorization

Causal structure learning 

Co-Author

Yang Ni, Texas A&M University

First Author

Aramayis Dallakyan

Presenting Author

Aramayis Dallakyan

The Recursive Partitioning BLUP (RP-BLUP) for Improved Estimation of Heterogeneous Treatment Effects

Recent years have seen significant methodological advancements in predicting heterogeneous treatment effects (HTE). However, there is a scarcity of methodological approaches for HTEs arising from random effects. Our work addresses the challenges in estimating HTE stemming from random effects. We particularly focus on developing a methodology for estimating HTE when the design matrix forming the random effects is unidentified, a scenario frequently encountered in many practical fields. When cluster distributions are separated by covariates, we demonstrate that random effects can be estimated through tree nodes. We provide theoretical proofs for consistency. The model is validated using both simulations and real-world data and compared with causal forests. This approach expands the applicability of tree algorithms and enhances the role of random effects in HTE estimation.

Keywords

Heterogeneous treatment effect

Random effect

Classification and regression trees

Model misspecification 

Co-Author(s)

J. Sunil Rao
Jiming Jiang, University of California, Davis

First Author

Eunsan Kim, University of Minnesota

Presenting Author

Eunsan Kim, University of Minnesota