Using LLM predictions in unbiased causal estimation with unobserved variables

Zach Wood-Doughty (Speaker)
Northwestern University
 
Sunday, Aug 3: 4:45 PM - 5:05 PM
Topic-Contributed Paper Session 
Music City Center 

Description

Causal estimation requires assumptions about the underlying data-generating process. To obtain unbiased estimates, we typically assume that there is no unobserved confounding and adjust for the confounders that influence both the treatment and the outcome. In application domains such as clinical records, where text may supplement structured data, there may be unobserved confounders that can nonetheless be accounted for using more complex identification and estimation strategies. However, large language models (LLMs), despite their strong predictive performance, often do not meet the statistical assumptions required for causal estimation. This presentation discusses two ways in which causal estimation methods can be augmented with LLMs to enable unbiased estimation of causal effects: Double Machine Learning and a measurement error framework.
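
As a rough illustration of the Double Machine Learning idea mentioned in the description, the sketch below implements the standard cross-fitted "partialling-out" estimator on simulated data. It is not the speaker's method. The function names (`fit_ols`, `predict_ols`, `dml_partial_out`, `simulate`), the simulated coefficients, and the use of ordinary least squares as the nuisance model are all assumptions made for this example; in the setting described above, an LLM-based predictor would presumably take the place of the nuisance model.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares fit with an intercept; stands in for any nuisance model."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def predict_ols(coef, X):
    """Predict from coefficients returned by fit_ols."""
    return np.column_stack([np.ones(len(X)), X]) @ coef


def dml_partial_out(X, t, y, n_folds=5, seed=0):
    """Cross-fitted partialling-out estimate of the effect of t on y.

    For each fold, the nuisance models E[t | X] and E[y | X] are fit on the
    other folds, and residuals are computed on the held-out fold. The effect
    is the slope from regressing outcome residuals on treatment residuals.
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    t_res = np.empty(len(y))
    y_res = np.empty(len(y))
    for k, held_out in enumerate(folds):
        train = np.concatenate([f for j, f in enumerate(folds) if j != k])
        t_hat = predict_ols(fit_ols(X[train], t[train]), X[held_out])
        y_hat = predict_ols(fit_ols(X[train], y[train]), X[held_out])
        t_res[held_out] = t[held_out] - t_hat
        y_res[held_out] = y[held_out] - y_hat
    return float(t_res @ y_res / (t_res @ t_res))


def simulate(n=5000, theta=2.0, seed=1):
    """Hypothetical linear data: X confounds both treatment t and outcome y."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 3))
    t = X @ np.array([1.0, -0.5, 0.25]) + rng.normal(size=n)
    y = theta * t + X @ np.array([1.0, -1.0, 0.5]) + rng.normal(size=n)
    return X, t, y


if __name__ == "__main__":
    X, t, y = simulate()
    # Naive slope of y on t, ignoring the confounders X.
    naive = float(np.cov(t, y)[0, 1] / np.var(t, ddof=1))
    print("naive estimate:", naive)
    print("DML estimate:  ", dml_partial_out(X, t, y))
```

In this simulation the true effect is set to 2.0, so the cross-fitted estimate should land close to that value while the naive slope should not. Cross-fitting matters most when the nuisance models are flexible learners rather than OLS, since it keeps each residual from being computed by a model that saw that observation during fitting.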