Teaching Causal Inference with Observational Data

Jeffrey Witmer Chair
Oberlin College

Milo Schield Organizer
New College of Florida

Wednesday, Aug 6: 8:30 AM - 10:20 AM
0230
Invited Paper Session

Music City Center

Room: CC-207C

Most students are more interested in causal inference than in population inference. Most students are more interested in observational causation than in randomized-experimental causation. Observational causation has been modeled from four different perspectives: Galton's correlation-regression method, Rubin's propensity-imputation method, Pearl's DAG-SEM method, and a Statistical Literacy combination of Wainer's graphical method of controlling for confounding and Cornfield's necessary conditions for nullification. Since Galton's regression method is well known and taught in the multivariate course, this session focuses on teaching observational causation using the last three methods. If we want our students to have a lasting appreciation of statistics, we need to provide a separate introductory course on causal inference that stands alongside the traditional course on population inference, so that together they form a two-semester sequence.

Keywords

observational causation

Applied

Yes

Main Sponsor

Section on Statistics and Data Science Education

Co Sponsors

International Statistical Literacy Project of IASE

Section on Teaching of Statistics in the Health Sciences

Presentations

Comparing potential outcomes and directed acyclic graph approaches to understanding causal inference from data

Most beginning statistics students are warned that "Correlation is not causation." Yet many of the most important statistical problems involve using data to make causal inferences. This paper shows how to build student understanding of important principles underlying causal inference. The approach involves comparing the potential outcomes approach pioneered by Donald Rubin with the Directed Acyclic Graphs (DAGs) approach pioneered by Judea Pearl. Central to the discussion is the idea of identification. Another key point is understanding the difference between the data collection and measurement process and the statistical modeling process used to perform causal inference.

Keywords

causal inference

Speaker

Christopher Rhoades, University of Connecticut

Refining Students' Statistical Inference, Modeling, Visualization, and Computation Skills through the Lens of Causality

Although undergraduate students know that "correlation doesn't imply causation" and that "confounding variables" pose problems, they usually don't know what does imply causation, or how to diagnose when confounding may be a concern. In this talk, I give guidance on how instructors can improve students' understanding of causality and confounding. Even if your expertise is not in causal inference, I show how you can teach causality to facilitate spiral learning in a statistics curriculum, where students practice many statistical skills (inference, modeling, visualization, computation) through the lens of causality. For example, students can establish many important theoretical results for causality with expectations, variances, and covariances; these results in turn motivate modeling tasks, perhaps with linear regression, logistic regression, or more advanced machine learning. A core idea is that measuring causal effects boils down to "apples-to-apples" comparisons, which can be diagnosed with visualizations and created by computationally matching similar treatment and control subjects, an approach students find intuitive. These insights are based on courses I've developed at Carnegie Mellon University, including a sophomore-level undergraduate class on causality and education programs for working professionals outside of statistics. Thus, attendees of this talk will walk away with a set of tools to teach causality to improve undergraduate statistics skills, either in a standalone class or in a couple of lessons.

Keywords

Causal inference

confounding

Speaker

Zach Branson, Carnegie Mellon University

Teaching Causal Diagrams in an Introductory Statistics Course and Beyond

There's no shortage of material in introductory statistics courses, and the choice of topics is often out of a teacher's control, determined by tradition and institutional expectations. So why consider adding one more? Tools from Causal Inference offer an alluring promise: the ability to estimate cause-and-effect relationships from observational data—a skill increasingly valuable in both academic and commercial settings.

There are multiple frameworks within Causal Inference, and even experts disagree on which is best. Rather than advocating for a single approach, I focus on core concepts that provide a flexible foundation—one that prepares students to engage with the advanced frameworks they may encounter later. I'll share my experience incorporating Causal Diagrams—Directed Acyclic Graphs, as formalized by Judea Pearl—into an introductory statistics course. This visual tool helps students distinguish between types of correlation while emphasizing that conclusions depend on assumptions as well as data.

In our AI-driven era, a foundation in causal reasoning is critical for understanding the limits of predictive models. The proposed module fits within one week (3 lecture hours, 6 homework hours) and can be integrated into existing sections on observational studies or descriptive statistics. It requires minimal technical background and lays the groundwork for further study in either traditional statistical methods or AI-driven causal discovery. Participants will come away with practical strategies for introducing causal thinking in a way that complements traditional topics and equips students with tools essential for the future.

Keywords

Causal inference

Speaker

Rosanna Overholser, Cal Poly Humboldt

Teaching Observational Causation using the Wainer-Cornfield Approach

Most students are more interested in causal inference than in population inference. Most students are more interested in observational causation than in randomized-experimental causation. They want to use observational statistics as evidence for causal connections. This paper summarizes the basic quantitative needs of today's students and shows how a confounder-based Statistical Literacy course addresses those needs. This course introduces Wainer's graphical approach to controlling for confounding of a two-group comparison of ratios, and introduces Cornfield's necessary conditions for a measured confounder to nullify or reverse an observationally based comparison of ratios. The goal is not so much to prove or infer causation as to sensitize students to how a crude comparison of ratios can be influenced by a confounder and to allow them to work problems without needing software or algebra. Using a textbook published by Kendall-Hunt, this Statistical Literacy course is taught at the University of New Mexico (UNM) and New College of Florida (NCF). At UNM it satisfies a mathematics requirement in their General Education curriculum and is required of students majoring in statistics. At UNM a new half-semester version is being required of all incoming students. Students value this course. About half of those taking this course at UNM and NCF agree that confounder-based Statistical Literacy should be required of all college students for graduation.

Keywords

causal inference

social epidemiology

Speaker

Milo Schield, New College of Florida