Detecting AI-Generated Survey Responses: Algorithm Development and Bias Mitigation

Brandon Sepulvado, First Author and Presenting Author
Lilian Huang, Co-Author

Thursday, Aug 7: 9:20 AM - 9:35 AM
2561 
Contributed Papers 
Music City Center 
Large language model (LLM)-generated responses to open-ended questions have become increasingly common in online surveys. This trend potentially compromises survey data quality and increases the cost of data collection and review. To tackle this challenge, we have developed a machine learning classifier that detects AI-generated responses to open-ended survey questions. This presentation will highlight the ways in which off-the-shelf LLMs do not respond like typical survey respondents and how key differences helped drive feature selection for the classifier. To create training data, we generated responses from LLMs (e.g., GPT, Llama, and Claude) and compared them with responses from survey respondents to multiple open-ended questions. Performance is excellent, with precision and recall as high as 99% on held-out training data and in the low 90s for unseen observations from subsequent surveys covering different question types and subject-matter domains. We will conclude the presentation with a discussion of bias and equity considerations, noting how performance varies across groups and suggesting equitable approaches to handling responses labeled as potentially AI-generated.
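The abstract does not specify the classifier's architecture or feature set. As a minimal sketch of the general pipeline it describes, assuming TF-IDF features and logistic regression as illustrative stand-ins (and using hypothetical toy responses), the workflow might look like this:

# Sketch of the general approach described in the abstract: train a
# classifier to separate human-written from LLM-generated open-ended
# survey responses, then evaluate with precision and recall. The
# features and model here are illustrative stand-ins, not the authors'
# actual method.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

# Hypothetical training data: human responses from prior surveys plus
# responses obtained by prompting LLMs (e.g., GPT, Llama, Claude) with
# the same open-ended questions.
human_responses = [
    "I mostly shop online because it saves me time after work.",
    "Honestly not sure, maybe the price? I grab whatever is cheapest.",
    "My kids' schedules decide everything, so convenience matters most.",
    "I like supporting local stores when I can afford to.",
]
llm_responses = [
    "There are several factors that influence my shopping decisions, including price, convenience, and quality.",
    "As a consumer, I consider a variety of aspects such as cost, brand reputation, and availability.",
    "My purchasing behavior is shaped by multiple considerations, most notably affordability and accessibility.",
    "Several elements play a role in my decision-making process, including product reviews and pricing.",
]

texts = human_responses + llm_responses
labels = [0] * len(human_responses) + [1] * len(llm_responses)  # 1 = AI-generated

# Hold out a test split to mirror the held-out evaluation in the abstract.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=0
)

vectorizer = TfidfVectorizer(ngram_range=(1, 2))
clf = LogisticRegression(max_iter=1000)
clf.fit(vectorizer.fit_transform(X_train), y_train)

preds = clf.predict(vectorizer.transform(X_test))
print("precision:", precision_score(y_test, preds))
print("recall:", recall_score(y_test, preds))

In practice, the same precision and recall metrics could be computed per respondent group (e.g., by demographic subgroup) to surface the performance disparities the bias and equity discussion addresses, and the out-of-domain drop the abstract reports (from 99% to the low 90s) suggests evaluating on surveys with new question types before deployment.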

Keywords

AI

Large language models

Survey data quality

Machine learning

Text analysis

Natural language processing 

Main Sponsor

Survey Research Methods Section