Building Better Data Analyses: Theory, Methods, and Lessons Learned

Abstract Number:

1355 

Submission Type:

Invited Paper Session 

Participants:

Roger Peng (1), Stephanie Hicks (2), Lucy D'Agostino McGowan (3), Shannon Ellis (4), Matthew Vanaman (1)

Institutions:

(1) University of Texas, Austin, N/A, (2) Johns Hopkins University, Bloomberg School of Public Health, N/A, (3) Wake Forest University, N/A, (4) UC San Diego, N/A

Chair:

Roger Peng  
University of Texas, Austin

Session Organizer:

Roger Peng  
University of Texas, Austin

Speaker(s):

Stephanie Hicks  
Johns Hopkins University, Bloomberg School of Public Health
Lucy D'Agostino McGowan  
Wake Forest University
Shannon Ellis  
UC San Diego
Matthew Vanaman  
University of Texas, Austin

Session Description:

Description: Data science has risen rapidly in the collective consciousness of our society and the ability to analyze data well is quickly becoming an essential skill. As a result, it has become urgent that data analysis training and education be scaled broadly. However, a fundamental problem in the practice of data analysis is determining how to formally evaluate the quality of a given data analysis and how to get students and practitioners to do better data analyses. We must move beyond the "know it when we see it" phase of data analysis and build more formal understandings of data analytic quality. This requires building models of various aspects of the analytic process and distilling generalizable lessons from data analytic experience. With such models and information, we can then scale the training of data analysis beyond the often-used apprenticeship model. The session will explore the theoretical, practical, and pedagogical aspects of conducting data analyses and address key areas that can lead to better data analyses and scalable training.

Focus: We highlight four areas of data science activity -- the iterative cycle of analysis, the alignment of the analyst and audience to produce useful analyses, the distillation of generalizable lessons from analytic experience, and the role of reflective practice when comparing expectations to observations in data analysis.

Appeal: For many, including students at both the undergraduate and graduate level, data analysis can appear to be a nebulous and mysterious process. While some eventually learn through experience, many do not, and it is worth asking whether such a process can be accelerated and made more equitable. This session will explore formal mechanisms for understanding data analysis that are analogous to the approaches taken with learning statistical theory and methods.

Timeliness: More people than ever before are analyzing data, whether they know it or not. More students than ever before want to learn data science and get data science jobs. There is therefore a demand to develop approaches for formally discussing the quality of data analysis and for providing concrete and consistent advice on how to improve data analyses. This session provides some of the foundational ideas upon which such a formal system can be built.

Content:

Stephanie C. Hicks
Department of Biostatistics
Johns Hopkins University
Talk Title: Modeling Analytic Iteration with Probabilistic Outcome Sets

Lucy D'Agostino McGowan
Department of Statistics
Wake Forest University
Talk Title: Evaluating the Alignment of a Data Analysis between Analyst and Audience

Shannon Ellis
Department of Cognitive Science
University of California, San Diego
Talk Title: Lessons Learned from 1,000 Data Science Projects

Matthew Vanaman
Department of Statistics and Data Sciences
University of Texas, Austin
Talk Title: Exploring Reflective Practice in Data Analysis

Sponsors:

Section on Statistical Graphics 2
Section on Statistics and Data Science Education 1
Section on Teaching of Statistics in the Health Sciences 3

Theme: Statistics and Data Science: Informing Policy and Countering Misinformation

Yes

Applied

Yes

Estimated Audience Size

Small (<80)

I have read and understand that JSM participants must abide by the Participant Guidelines.

Yes

I understand and have communicated to my proposed speakers that JSM participants must register and pay the appropriate registration fee by June 1, 2024. The registration fee is nonrefundable.

I understand