Veridical Data Science Education

Matteo Bonvini Chair
 
Rebecca Barter Panelist
University of Utah
 
Bin Yu Panelist
University of California at Berkeley
 
Andrew Bray Panelist
UC Berkeley
 
Joshua Rosenberg Panelist
University of Tennessee, Knoxville
 
Ruobin Gong Panelist
Rutgers University
 
Ruobin Gong Organizer
Rutgers University
 
Matteo Bonvini Organizer
 
Thursday, Aug 7: 8:30 AM - 10:20 AM
0661 
Topic-Contributed Panel Session 
Music City Center 
Room: CC-104C 
The past decade saw data science flourish at institutions of higher education. Fueled by high demand for data scientists in industry, data science transformed from the murky composite it once was into the keystone of every modern quantitative education, supported by programs, departments, and schools newly dedicated to the name. Educators from traditional disciplines, statistics included, find themselves teaching an audience who embrace the identity of data scientists and look to prepare themselves accordingly. Our students, these next-generation data scientists, will shoulder the charge of making trustworthy scientific discoveries, reaching informed policy decisions, and giving sound business advice. As educators, what might we do to ready them for a modern reality in which data are abundant yet truth is scarce? As statisticians, how do we steward the quintessence of the statistical principles that we hold dear, while persuading our students to see the broader picture comprehensively and fairly?

As an example of recent efforts to address these challenges pedagogically, Yu and Barter's Veridical Data Science (2024) articulates key aspects of modern data science that distinguish it from classical statistical training. A data science project is construed as a life cycle. Before the inferential or predictive analysis that is traditionally regarded as the core of statistical modeling, a Data Science Life Cycle begins with the formulation of the domain scientific problem, followed by data collection, preprocessing, and exploratory analysis. The analysis is in turn followed by scrutiny of its results, and by their interpretation and communication to update domain knowledge. The Veridical Data Science (VDS) framework underscores critical thinking and the indispensable role of human judgment calls throughout the Data Science Life Cycle as the basis for extracting and communicating useful and trustworthy information from data. The emphasis that VDS places on the predictability, computability, and stability of data-based scientific evidence provides a more comprehensive and realistic portrayal of what is required of a good data scientist today.

This session features a diverse panel of experienced data science educators and researchers of data science education, who will share their perspectives on topics that reflect the session's central theme illustrated above. The topics include:

- Principles of Veridical Data Science;
- Modernization of traditional statistical curriculum via the Veridical Data Science framework;
- Pedagogical considerations and empirical challenges of modern data science education at the undergraduate, graduate, and K-12 levels;
- Reconciliation of diverse priorities and starting points of data science educators to better inform a set of coalesced guiding principles for the future.

The panel's timely discussion will shed light on visions and pedagogical strategies for effectively preparing the next generation of data science talent through both traditional and novel means.

Applied

Yes

Main Sponsor

Section on Statistics and Data Science Education

Co Sponsors

Business Analytics/Statistics Education Interest Group
Section on Teaching of Statistics in the Health Sciences