Integrating code testing into the data science workflow to enhance trustworthiness

Rohan Alexander, Speaker
 
Sunday, Aug 4: 4:05 PM - 5:30 PM
Invited Paper Session 
Oregon Convention Center 
Code now plays a central role in much of statistical analysis, especially in data science. But few data scientists or statisticians have foundational software engineering skills, and complications specific to data science mean that some of these skills are not directly transferable in any case. We build one way of integrating testing into a data science workflow, with a particular focus on statistical modeling, and then show how Large Language Models (LLMs) can be used to automate aspects of this code testing suite. Our workflow and approach enhance the trustworthiness of conclusions drawn from data.