61: Using AI to Process Paper and Pencil Survey Questionnaire Data

Jennifer Vanicek Co-Author
NORC at the University of Chicago
 
Lilly Grella Co-Author
NORC at the University of Chicago
 
Mehmet Celepkolu Co-Author
NORC at the University of Chicago
 
Kari Carris Co-Author
NORC at the University of Chicago
 
Peyton Holleran First Author
NORC at the University of Chicago
 
Peyton Holleran Presenting Author
NORC at the University of Chicago
 
Wednesday, Aug 6: 10:30 AM - 12:20 PM
1115 
Contributed Posters 
Music City Center 
As technology advances, so do the data entry methods available for survey research. Researchers have traditionally relied on manual data entry and, more recently, on optical character recognition (OCR) scanning. Although manual data entry yields accurate data, its cost and time requirements remain high. To reduce cost and increase efficiency, NORC moved from manual data entry to OCR scanning for the 2024-25 round of the Reproductive Health Experiences and Access (RHEA) Survey. During 2025, NORC tested AI-based document processing for paper-and-pencil survey responses. We used Azure AI Document Intelligence and Azure OpenAI large language models (LLMs) to develop an unsupervised approach to data extraction via Markdown format. We compared the accuracy, efficiency, and costs of the Azure AI approach to those of OCR scanning. This paper will share details about the technical process and provide initial findings related to the time, cost, and accuracy of data entry via these two methods.
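
One way an accuracy comparison between two data capture methods could be scored is sketched below in Python. The abstract does not say how accuracy was measured, so everything here is an assumption for illustration: the hand-keyed reference set, the function names `normalize` and `field_accuracy`, the exact-match rule, and the sample records, which are made up and are not survey results.

```python
from typing import Dict

# A record maps a question identifier to the captured answer (hypothetical layout).
Record = Dict[str, str]


def normalize(value: str) -> str:
    """Collapse whitespace and ignore case so trivial differences do not count as errors."""
    return " ".join(value.split()).lower()


def field_accuracy(reference: Dict[str, Record], captured: Dict[str, Record]) -> float:
    """Return the share of reference fields that the capture method reproduced."""
    total = 0
    matches = 0
    for form_id, ref_fields in reference.items():
        cap_fields = captured.get(form_id, {})
        for question, ref_value in ref_fields.items():
            total += 1
            # A missing field is treated as an empty answer, so it counts as a mismatch.
            if normalize(cap_fields.get(question, "")) == normalize(ref_value):
                matches += 1
    return matches / total if total else 0.0


# Made-up illustration data: a hand-keyed reference and two capture outputs.
reference = {
    "form_001": {"Q1": "Yes", "Q2": "3"},
    "form_002": {"Q1": "No", "Q2": "5"},
}
ocr_output = {
    "form_001": {"Q1": "Yes", "Q2": "8"},
    "form_002": {"Q1": "No", "Q2": "5"},
}
ai_output = {
    "form_001": {"Q1": "yes ", "Q2": "3"},
    "form_002": {"Q1": "No", "Q2": "5"},
}

print(f"OCR field accuracy: {field_accuracy(reference, ocr_output):.2f}")
print(f"AI field accuracy:  {field_accuracy(reference, ai_output):.2f}")
```

Scoring at the field level rather than the form level keeps a single misread item from marking an entire questionnaire as wrong, though a real evaluation might also weight fields by type, such as check boxes versus handwritten text.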

Keywords

artificial intelligence

generative AI

optical character recognition

data entry

data entry methods 

Main Sponsor

Survey Research Methods Section