Sunday, Aug 3: 4:00 PM - 5:50 PM
0941
Topic-Contributed Paper Session
Music City Center
Room: CC-104B
Large language model
Causal inference and discovery
Machine learning
AI
Applied: Yes
Main Sponsor
Section on Text Analysis
Co Sponsors
Section on Statistical Computing
Presentations
Large language models (LLMs) are increasingly being applied in scientific research because of their advanced reasoning capabilities; multi-modal LLMs, for instance, can process diverse data types as inputs, expanding their utility across domains. However, while traditional causality methods focus primarily on tabular data, existing language models are largely limited to inferring causal relationships from text. In this work, we leverage the reasoning capabilities of language models to infer and discover causal relationships directly from tabular data. The proposed framework uses the Mamba (state space model) language model architecture with added layers for classification tasks. To ensure robustness and generalizability, we incorporate a diverse range of simulated data and 10 curated real-world datasets into the training procedure. The framework is also designed to be extensible, allowing users to integrate their own data as well as additional scores and tests. Our results demonstrate that the proposed causal framework outperforms existing methods in accuracy.
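As a rough illustration of the kind of architecture described above (not the authors' actual implementation), the sketch below treats the rows of a tabular dataset as a sequence fed to a sequence-model backbone with a classification head that predicts the causal relation for a variable pair. A GRU stands in for the Mamba state space block so the example runs with plain PyTorch; all class names and hyperparameters are hypothetical.

# Illustrative sketch only: a sequence model over tabular variable pairs with a
# classification head, in the spirit of the framework described above. A GRU
# stands in for the Mamba backbone so the example runs with plain PyTorch.
import torch
import torch.nn as nn

class PairwiseCausalClassifier(nn.Module):
    """Classify a variable pair as X->Y, Y->X, or no edge from its samples."""
    def __init__(self, hidden_dim=128, num_classes=3):
        super().__init__()
        # Each "token" is one joint sample (x_i, y_i); the sequence is the dataset.
        self.embed = nn.Linear(2, hidden_dim)
        self.backbone = nn.GRU(hidden_dim, hidden_dim, batch_first=True)  # stand-in for a Mamba block
        self.head = nn.Linear(hidden_dim, num_classes)                    # added classification layers

    def forward(self, samples):            # samples: (batch, n_samples, 2)
        h = self.embed(samples)
        _, last = self.backbone(h)         # final hidden state summarizes the dataset
        return self.head(last.squeeze(0))  # (batch, num_classes) logits

# Usage: score a batch of 8 synthetic datasets, each with 500 (x, y) samples.
model = PairwiseCausalClassifier()
logits = model(torch.randn(8, 500, 2))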
Keywords
Large language model
Causal inference
Causal discovery
Tabular data
Complex causal systems
Traditional methods for causal graph recovery often rely on statistical estimation or expert input, which can be limited by bias and incomplete knowledge. In this presentation, we introduce an approach that integrates large language models (LLMs) with constraint-based causal discovery to infer causal structures from scientific literature. Our method employs LLMs as knowledge extractors to identify associational relationships among variables from extensive scientific corpora. These relationships are then refined into causal graphs via constraint-based algorithms that eliminate inconsistent connections.
Rather than depending on LLMs for complex causal reasoning, our method leverages their strength in interpreting and extracting information from large-scale scientific texts. This allows us to uncover nuanced associational and causal insights without relying solely on the models' reasoning capabilities. By integrating textual knowledge extraction with causal inference techniques, our method provides a scalable, automated solution for causal discovery, mitigating human bias and harnessing the collective knowledge embedded in scientific discourse.
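The sketch below illustrates this two-stage idea in miniature: candidate associations, as an LLM knowledge extractor might return them, are pruned with a conditional-independence test on observational data. The extraction step, variable names, and the single pruning pass are illustrative stand-ins for the full constraint-based algorithm described above.

# Hedged sketch: (1) an LLM has already been prompted over a corpus and returned
# candidate variable pairs; (2) the candidates are pruned with a simple
# conditional-independence (partial correlation) test. The candidate edges and
# data below are synthetic stand-ins, not output of the actual pipeline.
import numpy as np
from scipy import stats

def partial_corr_pvalue(data, i, j, conditioning):
    """p-value for corr(X_i, X_j | X_conditioning) via residualization."""
    def residual(target):
        if not conditioning:
            return data[:, target]
        Z = np.column_stack([data[:, k] for k in conditioning] + [np.ones(len(data))])
        beta, *_ = np.linalg.lstsq(Z, data[:, target], rcond=None)
        return data[:, target] - Z @ beta
    r, p = stats.pearsonr(residual(i), residual(j))
    return p

# Candidate edges as an LLM knowledge extractor might return them (hypothetical).
variables = ["smoking", "tar", "cancer"]
llm_candidate_edges = [(0, 1), (1, 2), (0, 2)]

# Synthetic observational data with the same column order.
rng = np.random.default_rng(0)
smoking = rng.normal(size=2000)
tar = smoking + 0.1 * rng.normal(size=2000)
cancer = tar + 0.1 * rng.normal(size=2000)
data = np.column_stack([smoking, tar, cancer])

# Constraint-based pruning: drop an edge if the pair is independent given the rest.
kept = []
for i, j in llm_candidate_edges:
    rest = [k for k in range(len(variables)) if k not in (i, j)]
    if partial_corr_pvalue(data, i, j, rest) < 0.01:
        kept.append((variables[i], variables[j]))
print(kept)  # expect smoking--tar and tar--cancer to survive; smoking--cancer to drop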
Keywords
LLMs
Causal discovery and recovery
Causal estimation requires assumptions about the underlying data-generating process. To achieve unbiased estimates, we typically assume no unobserved confounding and adjust for confounders that influence both the treatment and the outcome. In application domains such as clinical records, where text may supplement structured data, there may be unobserved confounders that can be accounted for using more complex identification and estimation strategies. However, the large language models (LLMs) that achieve strong predictive performance on such text often do not meet the statistical assumptions required for causal estimation. This presentation discusses two ways in which causal estimation methods can be augmented with LLMs to enable unbiased estimation of causal effects: through Double Machine Learning or through a measurement error framework.
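For readers unfamiliar with the first route, the following is a minimal sketch of cross-fitted Double Machine Learning in which generic feature vectors stand in for LLM embeddings of clinical text; it is background illustration, not the presenter's implementation.

# Hedged sketch of the Double Machine Learning route: nuisance models predict the
# treatment and the outcome from features (here, simulated stand-ins for LLM
# embeddings of clinical notes), and the effect is estimated from cross-fitted
# residuals. Variable names and the data-generating process are illustrative.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import KFold

rng = np.random.default_rng(1)
n, d = 2000, 32
Z = rng.normal(size=(n, d))                               # stand-in for LLM text embeddings (confounders)
treatment = Z[:, 0] + rng.normal(size=n)                  # treatment depends on a confounder
outcome = 2.0 * treatment + Z[:, 0] + rng.normal(size=n)  # true effect = 2.0

# Cross-fitting: residualize treatment and outcome on Z using out-of-fold predictions.
t_res, y_res = np.zeros(n), np.zeros(n)
for train_idx, test_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(Z):
    m_t = GradientBoostingRegressor().fit(Z[train_idx], treatment[train_idx])
    m_y = GradientBoostingRegressor().fit(Z[train_idx], outcome[train_idx])
    t_res[test_idx] = treatment[test_idx] - m_t.predict(Z[test_idx])
    y_res[test_idx] = outcome[test_idx] - m_y.predict(Z[test_idx])

# Final stage: regress outcome residuals on treatment residuals (partialling out).
effect = (t_res @ y_res) / (t_res @ t_res)
print(f"estimated effect: {effect:.2f}")  # should be close to 2.0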
Causality is a fundamental notion in science, engineering, and even in machine learning. Uncovering the causal process behind observed data can naturally help answer 'why' and 'how' questions, inform optimal decisions, and achieve adaptive prediction. In many scenarios, observed variables (such as image pixels and questionnaire results) are often reflections of the underlying causal variables rather than being causal variables themselves. Causal representation learning aims to reveal the underlying hidden causal variables and their relations. In this talk, we show how the modularity property of causal systems makes it possible to recover the underlying causal representations from observational data with identifiability guarantees: under appropriate assumptions, the learned representations are consistent with the underlying causal process. We demonstrate how identifiable causal representation learning can naturally benefit generative AI, with image generation, image editing, and text generation as particular examples.
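As background, a standard formulation in this literature (not necessarily the speaker's exact model) posits that the observed variables x are generated from latent causal variables z through an unknown mixing function,

\[
x = g(z), \qquad z_i = f_i\big(\mathrm{pa}(z_i),\ \varepsilon_i\big), \quad i = 1, \dots, n,
\]

where pa(z_i) denotes the latent parents of z_i and the noises \varepsilon_i are independent; identifiability means the learned representation recovers z and its causal relations up to acceptable indeterminacies under the stated assumptions.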
Speaker
Kun Zhang, Carnegie Mellon University & MBZUAI