Thursday, Aug 8: 8:30 AM - 10:20 AM
1681
Topic-Contributed Paper Session
Oregon Convention Center
Room: CC-G131
Various data technologies and automated approaches can assist with assessment, but care is needed to develop tools and practices that value and support the human learning experience while also optimising for efficiency and accuracy. Tools are also needed that support teachers in making consistent and valid judgments across very large quantities of responses. This session will present talks related to research and practice within the emerging area of large-scale, automation-assisted assessment. Drawing primarily on the teaching context of introductory statistics and data science courses, each speaker will discuss their development of different innovative assessment practices and the opportunities, challenges, and rewards of using automation with respect to teaching and research.
Applied: Yes
Main Sponsor
Section on Statistics and Data Science Education
Co Sponsors
Caucus for Women in Statistics
Presentations
It is vital that introductory-level statistics and data science students learn how to identify and produce short written communications that are statistically and computationally sound. However, there are pedagogical and practical challenges to designing and implementing effective formative assessment of student writing when courses involve hundreds or thousands of students. Scalable methods of support are needed to help students not only produce high-quality writing, but also understand the statistical and computational concepts that necessitate the careful use of language. This talk will present our pedagogical and technological explorations for supporting "real-time" formative assessment of writing within large introductory statistics lectures. Examples of the tasks and technology used will be examined from the perspectives of student engagement and conceptual development.
This talk seeks to articulate the benefits of free-response tasks and timely formative assessment feedback, a roadmap for developing human-in-the-loop natural language processing (NLP)-assisted feedback, and results from a pilot study establishing proof of principle. If we are to pursue Statistics and Data Science Education across disciplines, we will surely encounter both the opportunity and the necessity to develop scalable solutions for pedagogical best practices. Research suggests that "write-to-learn" tasks improve learning outcomes, yet constructed-response methods of formative assessment become unwieldy as class sizes grow large. In the pilot study, several short-answer tasks completed by nearly 2000 introductory tertiary statistics students were evaluated by human raters and an NLP algorithm. After briefly describing the tasks, the student contexts, the algorithm, and the raters, this talk discusses the results, which indicate substantial inter-rater agreement and group consensus. The talk will conclude with recent developments building upon this pilot, as well as implications for teaching and future research.
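For readers unfamiliar with agreement measures of this kind, the sketch below shows one common way to quantify inter-rater agreement in R: Cohen's kappa computed between a human rater and an NLP classifier. The data, rubric levels, and agreement rate here are entirely hypothetical and are not drawn from the pilot study or its algorithm.

```r
# Illustrative sketch only (not the study's data or algorithm): Cohen's kappa
# between a human rater and an NLP classifier scoring the same short answers.
set.seed(1)
labels <- c("insufficient", "partial", "complete")   # hypothetical rubric levels
human  <- sample(labels, 200, replace = TRUE)        # simulated human ratings
# Simulated NLP ratings that agree with the human rater about 80% of the time
nlp    <- ifelse(runif(200) < 0.8, human, sample(labels, 200, replace = TRUE))

tab        <- table(human, nlp)
p_observed <- sum(diag(tab)) / sum(tab)                       # raw agreement
p_expected <- sum(rowSums(tab) * colSums(tab)) / sum(tab)^2   # chance agreement
kappa      <- (p_observed - p_expected) / (1 - p_expected)    # Cohen's kappa

round(c(agreement = p_observed, kappa = kappa), 2)
```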
We have developed an introductory data science course that serves between 400 and 700 upper-division students every quarter. The course has run almost every quarter since 2017 and is very popular; in spite of the large class sizes, it is still regularly among the most waitlisted courses at UC San Diego. A major component of the course is a seven-week group project in which the students define a question for themselves, find or create the data to address it, analyze the data, and write a formal report. In this talk I will discuss (1) course aims and topics covered, (2) our workflow for serving students at scale, (3) our use of autograded exercises, (4) our open-source scripts for creating and managing project groups, which integrate the Canvas LMS and GitHub, and (5) how we address the challenges project groups often face. If time permits, I will also discuss using our automated tooling to support a pedagogical experiment investigating how group gender composition affects student satisfaction and group dynamics.
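As a rough illustration of what a Canvas-GitHub integration of this kind can involve (not the speaker's actual open-source scripts), the R sketch below lists the project groups in a Canvas course and creates a matching GitHub team for each. The institution URL, course id, organization name, and token environment variables are placeholders.

```r
# Hypothetical sketch: mirror Canvas project groups as GitHub teams.
library(httr)

canvas_base <- "https://canvas.example.edu"   # placeholder institution URL
course_id   <- "12345"                        # placeholder Canvas course id
github_org  <- "intro-data-science"           # placeholder GitHub organization

canvas_headers <- add_headers(
  Authorization = paste("Bearer", Sys.getenv("CANVAS_TOKEN"))
)
github_headers <- add_headers(
  Authorization = paste("token", Sys.getenv("GITHUB_TOKEN")),
  Accept = "application/vnd.github+json"
)

# List the project groups defined in the Canvas course
groups <- content(GET(
  sprintf("%s/api/v1/courses/%s/groups?per_page=100", canvas_base, course_id),
  canvas_headers
))

# Create a matching private GitHub team for each Canvas group
for (g in groups) {
  POST(
    sprintf("https://api.github.com/orgs/%s/teams", github_org),
    github_headers,
    body = list(name = g$name, privacy = "closed"),
    encode = "json"
  )
}
```

A full version would also need to page through the Canvas results and map each student's Canvas identity to a GitHub username before adding team members; those steps are omitted here.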
Assessing statistical programming skills consistently and at scale is challenging. Much as writing style is assessed in essay tasks, discriminating code quality and style from code function or output is becoming increasingly important as students adopt code-generating tools such as ChatGPT. In many cases, checking code output alone is insufficient to assess students' understanding and ability to write statistical code. Instead, instructors often need to check the code itself for evidence of computational thinking, such as the use of appropriate functions, data structures, and comments. Unfortunately, manual review of code is time-consuming and subjective, and the skills needed to automate this process are complex to learn and use. In this talk, we introduce a new approach to authoring self-paced interactive modules for learning statistics with R. It is built using Quarto and WebR, leveraging literate programming to quickly create exercises and automate assessments. We discuss how this format can be used to write assessments with automated checking of multiple-choice quizzes and of code inputs and outputs, as well as the advantages of in-browser execution via WebR compared with existing server-based solutions.
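As one illustration of checking the code itself rather than its output (not the speakers' Quarto/WebR implementation), the R sketch below inspects the parse tree of a submitted answer to see whether it calls a required function. The required function `group_by` and the sample submission are assumptions chosen for the example.

```r
# Minimal sketch of one kind of automated code check: does the submitted code
# call a required function anywhere, including via the pkg::fun form?
uses_function <- function(code_string, fn_name) {
  exprs <- parse(text = code_string, keep.source = TRUE)
  pd    <- getParseData(exprs)
  any(pd$token == "SYMBOL_FUNCTION_CALL" & pd$text == fn_name)
}

# Hypothetical student submission
student_code <- '
mtcars |>
  dplyr::group_by(cyl) |>
  dplyr::summarise(mean_mpg = mean(mpg))
'

uses_function(student_code, "group_by")   # TRUE
uses_function(student_code, "lm")         # FALSE
```

Parsing rather than pattern-matching on the raw text avoids false positives from comments or strings, which is one reason structural checks of this kind are attractive for automated feedback.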