Tools for automating project-based data science classes with hundreds of students

Jason Fleischer (Speaker)
University of California (San Diego)
Thursday, Aug 8: 9:15 AM - 9:35 AM
Topic-Contributed Paper Session 
Oregon Convention Center 
We have developed an introductory data science course that serves between 400 and 700 upper-division students every quarter. The course has run almost every quarter since 2017 and is very popular: despite the large class sizes, it is regularly among the most waitlisted courses at UC San Diego. A major component of the course is a 7-week group project in which students define their own question, find or create the data to address it, analyze the data, and write a formal report. In this talk I will discuss (1) course aims and topics covered, (2) our workflow for serving students at scale, (3) the use of autograded exercises, (4) our open-source scripts for creating and managing project groups, which integrate the Canvas LMS and GitHub, and (5) how we address the challenges that project groups often face. If there is time, I will also talk about using our automated tooling to support a pedagogical experiment investigating how group gender composition affects student satisfaction and group dynamics.