CS6b: Invited Session - Cause For Celebration: Adapting Causal Inference Methods For Challenging Datasets

Conference: Women in Statistics and Data Science 2024
10/18/2024: 10:30 AM - 12:00 PM EDT
Panel 
Room: Spruce Oak 

Description

Statistical modeling and estimation of the unknown are ubiquitous in all areas of research, but these analyses often carry the age-old warning that "correlation does not imply causation." What if correlation is not enough? Causal inference methods are designed for precisely this situation. With new advances happening all the time, causal inference methods are highly visible in a variety of real-world settings. As they are applied more broadly, further advances are needed to accommodate common data challenges such as measurement error, misclassification, and sampling bias. Just as the goals of causal inference are distinct from those of traditional inference about correlations, the ways in which these challenges are overcome in causal inference are unique. This session, sponsored by the Caucus for Women in Statistics, brings together women in our field to celebrate the new advances they are making in causal inference.

Organizer

Sarah Lotspeich, Wake Forest University

Chair

Rebecca Knowlton, University of Texas at Austin

Target Audience

Mid-Level

Tracks

Knowledge
Women in Statistics and Data Science 2024

Presentations

Causal effect estimation in the presence of misclassified binary mediators

Causal mediation analyses allow researchers to quantify the effect of an exposure variable on an outcome variable through a mediator variable. If a binary mediator variable is misclassified, the resulting analysis can be severely biased. Misclassification is especially difficult to handle when it is differential and when no gold standard labels are available. Previous work has addressed this problem using a sensitivity analysis framework or by assuming that misclassification rates are known. We leverage a variable related to the misclassification mechanism to recover unbiased causal effect estimates without using gold standard labels. The proposed methods require the reasonable assumption that the sum of the sensitivity and specificity is greater than 1. An expectation-maximization algorithm is presented to estimate the model, and open-source software is provided to implement the proposed methods. We apply our misclassification correction strategies to investigate the mediating role of gestational hypertension on the association between maternal age and preterm birth.
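One way to see why a condition such as sensitivity + specificity > 1 is natural is to look at the simpler known-rates setting mentioned in the abstract. The sketch below is not the speaker's expectation-maximization method. It assumes nondifferential misclassification with known sensitivity (Se) and specificity (Sp), and the function names are hypothetical. Under those assumptions the observed prevalence of an error-prone binary variable is p* = Se * p + (1 - Sp) * (1 - p), which can be solved for the true prevalence p only when Se + Sp - 1 is nonzero, and the labels point in the right direction only when it is positive.

```python
def observed_prevalence(true_prev, sensitivity, specificity):
    """Prevalence of the error-prone label implied by the true prevalence."""
    return sensitivity * true_prev + (1.0 - specificity) * (1.0 - true_prev)


def corrected_prevalence(obs_prev, sensitivity, specificity):
    """Invert the misclassification model (assumes known, nondifferential rates)."""
    youden = sensitivity + specificity - 1.0
    if youden <= 0:
        # At 0 the observed labels carry no information; below 0 they are flipped.
        raise ValueError("requires sensitivity + specificity > 1")
    estimate = (obs_prev - (1.0 - specificity)) / youden
    # Clamp to [0, 1] because sampling noise can push the estimate outside.
    return min(1.0, max(0.0, estimate))


if __name__ == "__main__":
    p_star = observed_prevalence(0.30, sensitivity=0.90, specificity=0.80)
    print(round(p_star, 3))
    print(round(corrected_prevalence(p_star, 0.90, 0.80), 3))
```

When the rates are unknown or differential, as in the setting of the talk, this closed-form inversion is not available, which is part of what motivates a model-based approach.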

Speaker

Kimberly Hochstedler

Data integration approaches to estimate heterogeneous treatment effects

Clinicians and practitioners often want to determine which treatment would work best for a given individual based on that individual's observed characteristics. Doing so is challenging because sample sizes are typically not large enough and the variables that drive the true treatment effect heterogeneity are often unknown. To better understand treatment effect heterogeneity, researchers can combine information from multiple sources, e.g., multiple randomized controlled trials (RCTs). However, combining data requires taking into account that the data come from heterogeneous sources with different site-level characteristics that can affect treatment effects. Methods that combine RCTs also often yield treatment effect estimates that are conditional on trial membership, so applying these models to a new target population is not straightforward. This presentation introduces approaches for integrating multiple RCTs to estimate the conditional average treatment effect (CATE) function. We then discuss an approach, which draws on meta-analytic prediction intervals and is extended to non-parametric methods, for applying the CATE model estimated from multiple RCTs to an external target population. We examine performance in simulations and ultimately apply the approaches to real data comparing major depression treatments to investigate potential effect heterogeneity and estimate effects in a target population of patients in a health care system.
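To make the estimand concrete, the following is a minimal sketch of a CATE for a single binary covariate, estimated within each simulated trial and then pooled with sample-size weights. It is an illustration under strong assumptions (simulated data, a true CATE shared across trials, trials differing only by a baseline shift), not the integration or target-population methods described in the talk. All names and parameter values are hypothetical.

```python
import random
from statistics import mean


def simulate_trial(n, site_shift, rng):
    """Simulate one RCT: binary covariate x, randomized treatment t, outcome y."""
    rows = []
    for _ in range(n):
        x = rng.randint(0, 1)
        t = rng.randint(0, 1)      # 1:1 randomization
        effect = 1.0 + 2.0 * x     # assumed true CATE: 1 if x == 0, 3 if x == 1
        y = site_shift + 0.5 * x + effect * t + rng.gauss(0.0, 1.0)
        rows.append((x, t, y))
    return rows


def cate_by_stratum(rows, x_value):
    """Difference in mean outcomes between arms among units with x == x_value."""
    treated = [y for x, t, y in rows if x == x_value and t == 1]
    control = [y for x, t, y in rows if x == x_value and t == 0]
    return mean(treated) - mean(control)


def pooled_cate(trials, x_value):
    """Sample-size-weighted average of the within-trial stratum estimates."""
    weights = [sum(1 for x, _, _ in rows if x == x_value) for rows in trials]
    estimates = [cate_by_stratum(rows, x_value) for rows in trials]
    return sum(w * e for w, e in zip(weights, estimates)) / sum(weights)


rng = random.Random(2024)
# Three trials with different baseline outcome levels (site shifts).
trials = [simulate_trial(5000, shift, rng) for shift in (-1.0, 0.0, 1.5)]
for x_value in (0, 1):
    print(x_value, round(pooled_cate(trials, x_value), 2))
```

Estimating within each trial before pooling means the baseline shift cancels in every treated-versus-control difference. If the treatment effect itself varied by site, this simple pooling would no longer describe any single population, which is the difficulty the abstract raises.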

Speaker

Carly Brantner, Duke University

It's ME hi, I'm the collider it's ME

This talk will focus on framing measurement error as a collider from a causal inference perspective. We will begin by demonstrating how to visually display measurement error in directed acyclic graphs (DAGs). We will then show how these graphs can be used to help communicate when corrections for measurement error are needed and how to implement these corrections in order to estimate unbiased effects. Finally, we will demonstrate how sensitivity analyses traditionally used to address omitted variable bias can be used to quantify the potential impact of measurement error. 
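As a small numerical companion to these ideas, the sketch below simulates classical measurement error in an exposure and shows the resulting attenuation of a regression slope, along with a correction that assumes the reliability ratio is known. This is a textbook illustration, not the DAG-based or sensitivity-analysis approach of the talk. The variable names and parameter values are assumptions made for the example.

```python
import random


def slope(xs, ys):
    """Ordinary least squares slope of ys on xs (with an intercept)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


rng = random.Random(13)
n = 20000
true_beta = 2.0

x_true = [rng.gauss(0.0, 1.0) for _ in range(n)]            # unobserved exposure
# The observed exposure is a common effect of the truth and the error term.
x_obs = [x + rng.gauss(0.0, 1.0) for x in x_true]
y = [true_beta * x + rng.gauss(0.0, 1.0) for x in x_true]   # outcome depends on the truth

# Reliability ratio var(X) / (var(X) + var(U)); 0.5 here by construction.
reliability = 1.0 / (1.0 + 1.0)

naive = slope(x_obs, y)           # attenuated, roughly true_beta * reliability
corrected = naive / reliability   # valid only if the reliability is known

print(round(naive, 2), round(corrected, 2))
```

In practice the reliability is rarely known, which is where a sensitivity analysis over plausible values becomes useful.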

Speaker

Lucy D'Agostino McGowan, Wake Forest University