17. StatWrap: Managing Variation and Change in Support of Reproducible Research

Conference: Conference on Statistical Practice (CSP) 2024
02/27/2024: 5:30 PM - 7:00 PM CST
Posters 

Description

Practicing reproducible research is important but increasingly complex as studies involve more data, more code, and larger teams. Tools like Jupyter Notebook and R Markdown support reproducibility but are not designed to collect information such as: Who worked on the analyses, and what decisions did they make? Where did the data come from? What are the code file dependencies and code libraries? We developed StatWrap, a free and open-source software program, as an assistive, non-invasive discovery and inventory tool to document this information and how it changes over the course of a research project. StatWrap combines automatically collected metadata (e.g., statistical packages, code file dependencies), investigator-supplied documentation (e.g., analysis notes, personnel), and source control. StatWrap creates interactive "workflow graphs" illustrating relationships between code, data, and libraries. It helps team members document workflow and analysis decisions. StatWrap also creates a searchable project log of user actions in a project, for example, notes associated with data. By "wrapping" this information together, StatWrap promotes reproducibility, documenting data, code, collaborators, and their changes over time.

Keywords

reproducibility

collaborative biostatistics

research documentation 

Presenting Author

Leah Welty, Northwestern University, Feinberg School of Medicine

First Author

Luke Rasmussen, Northwestern University

CoAuthor(s)

Eric Whitley, Northwestern University
Abigail Baldridge, Northwestern University
Leah Welty, Northwestern University, Feinberg School of Medicine