Innovations in Survey Methodology

Chair: Jeramiah Yeksavich, US Department of Energy
 
Wednesday, Aug 7: 10:30 AM - 12:20 PM
5158 
Contributed Papers 
Oregon Convention Center 
Room: CC-E148 

Main Sponsor

Survey Research Methods Section

Presentations

Automating Quality Control in Recorded Interviews with Machine Learning

Interviewer-administered surveys can suffer from quality issues when question wording as administered differs from the questionnaire. Manual review is required to identify discrepancies and ensure survey quality. RTI QUINTET, a machine learning tool suite, automates these quality checks by comparing AI-generated transcripts to the questionnaire. Discrepancies between interviewer administration and the questionnaire are identified, and potentially problematic cases are prioritized for human review. This enhances data quality by identifying re-training opportunities and problematic questionnaire items. We evaluated QUINTET on a telephone healthcare survey with 923 recorded interviews. We compared a random subset of 21 manually transcribed cases to transcripts generated by QUINTET, treating the manual transcripts as ground truth. Preliminary results indicate 90% accuracy for QUINTET. We explore reasons for differences between human and automated transcripts, suggesting future improvements. We also transcribed all interviews to calculate the similarity of each transcript to the questionnaire, manually validating low-similarity cases. We conclude with a discussion of implications for survey practice.
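As a rough illustration of the similarity check described above, the sketch below flags cases whose automated transcripts diverge from the scripted wording. It is a minimal stand-in, not the QUINTET implementation; the example item, transcripts, and 0.8 threshold are hypothetical.

```python
# Minimal sketch (not the QUINTET implementation): flag interviewer question
# readings whose automated transcripts diverge from the scripted wording.
# The example item, transcripts, and 0.8 threshold are illustrative assumptions.
from difflib import SequenceMatcher

def similarity(scripted: str, transcribed: str) -> float:
    """Return a 0-1 similarity ratio between scripted and spoken wording."""
    return SequenceMatcher(None, scripted.lower().split(), transcribed.lower().split()).ratio()

scripted_item = "In the past 12 months, have you seen a doctor about your health?"
transcripts = {
    "case_001": "In the past twelve months have you seen a doctor about your health",
    "case_002": "Have you been to the doctor lately",
}

# Prioritize low-similarity cases for human review.
flagged = {}
for case, transcript in transcripts.items():
    score = similarity(scripted_item, transcript)
    if score < 0.8:
        flagged[case] = round(score, 2)
print(flagged)
```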

Keywords

Machine Learning

CATI

CARI

Automated Transcription

Survey Administration

Automated Quality Control 

Abstract 2930

Co-Author(s)

Kirsty Weitzel
Jerry Timbrook

First Author

Peter Baumgartner, RTI International

Presenting Author

Kirsty Weitzel

Evaluating the Impact of Three Incentive Schemes on Survey Responses in Online Longitudinal Panels

I will present the results of an online experiment that evaluated the impact of three incentive payment plans on survey responses in online panels. This study involved 500 online panelists drawn from the University of Michigan (U-M) master's student population. Over the course of six months starting in October 2023, they were asked to complete a 10-minute wellbeing survey every two and a half months, for a total of three waves. Participants were randomly assigned to one of three groups: the control group received a $5.00 cash incentive for each completed survey, mirroring typical longitudinal study incentives, while the two treated groups received either a $5.00 cash incentive one week before each survey wave or a one-time upfront lump sum of $15.00 paid unconditionally, irrespective of actual survey participation. Initial results show that response rates in the treated groups are significantly higher than in the control group across all survey waves. The treatment effects are robust to the inclusion of covariates. The talk will focus on the underlying theories and mechanisms that drive the results and discuss their implications for longitudinal data collection in online panels.
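A minimal sketch of the kind of analysis behind the covariate-robustness claim, assuming simulated data and hypothetical variable names (responded, arm, age) rather than the study's panel; the incentive arms mirror the three groups described above.

```python
# Illustrative sketch only: the data are simulated and the variable names
# (responded, arm, age) are assumptions, not the study's actual dataset.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 300
arm = rng.choice(["control", "pre_wave_5", "upfront_15"], size=n)
age = rng.integers(22, 35, size=n)
base_rate = {"control": 0.55, "pre_wave_5": 0.70, "upfront_15": 0.72}
responded = rng.binomial(1, [base_rate[a] for a in arm])
df = pd.DataFrame({"responded": responded, "arm": arm, "age": age})

# Raw response rates by incentive arm.
print(df.groupby("arm")["responded"].mean())

# Treatment effects with a covariate, mirroring the robustness check mentioned above.
model = smf.logit("responded ~ C(arm, Treatment('control')) + age", data=df).fit()
print(model.summary())
```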

Keywords

Longitudinal data collections

Probability-based online panels

Panel attrition

Survey nonresponse

Survey incentives

Student mental well-being 

Abstract 2171

First Author

Htay-Wah Saw

Presenting Author

Htay-Wah Saw

Evaluating the Measurement of Household Expectations with Audio Recordings and Machine Learning

In survey methodology, general compliance with protocols and individual interviewer performance have been analyzed with audio recordings. This is a resource-intensive task, since the audio must be listened to manually. By contrast, little work has been done on analyzing subjective probabilistic expectations questions. In economics, agents form expectations about unknown quantities in order to make decisions, and the research problem is often to infer the subjective probability distributions that express those expectations. In this paper, we develop a state-of-the-art audio transcription and speaker diarization machine learning pipeline and apply it to audio recordings of a subjective probabilistic expectations question from the Spanish Survey of Household Finances. We first compare the variables derived from the pipeline with a question evaluation sheet completed by the survey team. We then evaluate interviewers' question-reading behavior using novel natural language processing techniques. We find that the extracted audio features are useful for assessing compliance and interviewer performance and for detecting biased responses arising from interviewer-induced household probabilistic expectations.
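The sketch below illustrates one way such a transcription-plus-diarization pipeline could be wired together using open-source stand-ins (openai-whisper and pyannote.audio). It is not the authors' pipeline; the audio path, model names, and access token are placeholder assumptions.

```python
# Hedged sketch of a transcription + diarization pipeline using open-source
# stand-ins (openai-whisper and pyannote.audio); this is not the authors'
# pipeline, and the audio path, model names, and token are placeholder assumptions.
import whisper
from pyannote.audio import Pipeline

AUDIO = "interview_0001.wav"  # hypothetical recording

# 1) Automatic transcription of the recorded interview.
asr_model = whisper.load_model("base")
asr_result = asr_model.transcribe(AUDIO)

# 2) Speaker diarization: who spoke when (interviewer vs. respondent).
diarizer = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1",   # assumed pretrained model
    use_auth_token="HF_TOKEN",            # assumed Hugging Face access token
)
diarization = diarizer(AUDIO)

# 3) Attach a speaker label to each transcribed segment by time overlap, so the
#    interviewer's reading of the expectations question can be isolated for analysis.
def speaker_at(t: float) -> str:
    for segment, _, label in diarization.itertracks(yield_label=True):
        if segment.start <= t <= segment.end:
            return label
    return "unknown"

for seg in asr_result["segments"]:
    print(round(seg["start"], 1), speaker_at(seg["start"]), seg["text"].strip())
```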

Keywords

Machine Learning

Audio Transcription

Survey Methodology

Household Expectations 

Abstract 2899

Co-Author(s)

Javier J. Alonso, Bank of Spain
Laura Crespo

First Author

Nicolás Forteza, Bank of Spain

Presenting Author

Nicolás Forteza, Bank of Spain

Improving Sexual Identity Measures in Health Disparity Studies with Machine Learning and Resampling

Survey research on sexual identity often categorizes respondents as heterosexual, homosexual, or bisexual, but this may miss more nuanced identities. Prior work has shown that introducing a "something else" response option can affect health disparity estimates. However, many surveys lack this option. We propose a machine learning approach to infer "something else" responses in existing surveys that do not offer this option. Leveraging a split-ballot experiment in the 2015-2019 National Survey of Family Growth, we use the half-sample that includes "something else" as a training dataset and a set of supervised machine learning algorithms to develop a classifier for sexual identity. We then use the half-sample excluding "something else" as a test dataset, predicting responses on the four-category version of sexual identity and computing revised estimates of disparities based on these new predictions. We repeat this process using bootstrap resampling to generate an empirical distribution of revised disparity estimates, comparing them to estimates based on the original half-sample used for training. We conclude with implications of this work for future surveys measuring sexual identity.
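A hedged sketch of the train-on-one-half, predict-on-the-other, bootstrap-the-disparity workflow described above, using simulated data, assumed feature names, and a generic random-forest classifier rather than the authors' actual algorithms or NSFG variables.

```python
# Hedged sketch, not the authors' code: simulated half-samples, assumed feature
# names, and a generic classifier stand in for the actual algorithms and data.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.utils import resample

rng = np.random.default_rng(1)
n = 400

def fake_half(with_something_else: bool) -> pd.DataFrame:
    labels = ["heterosexual", "gay_lesbian", "bisexual"]
    if with_something_else:
        labels.append("something_else")
    return pd.DataFrame({
        "age": rng.integers(15, 45, n),
        "attraction_scale": rng.integers(1, 6, n),   # hypothetical predictor
        "identity": rng.choice(labels, n),
        "poor_health": rng.binomial(1, 0.2, n),
    })

train = fake_half(with_something_else=True)    # half-sample offering "something else"
test = fake_half(with_something_else=False)    # half-sample without the option
features = ["age", "attraction_scale"]

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(train[features], train["identity"])
test["identity_pred"] = clf.predict(test[features])   # four-category prediction

# Bootstrap an empirical distribution of a revised disparity estimate
# (here: poor-health gap of sexual-minority groups vs. heterosexual respondents).
gaps = []
for _ in range(500):
    boot = resample(test)
    rates = boot.groupby("identity_pred")["poor_health"].mean()
    gaps.append(rates.drop("heterosexual", errors="ignore").mean()
                - rates.get("heterosexual", np.nan))
print(np.nanpercentile(gaps, [2.5, 50, 97.5]))
```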

Keywords

Sexual Identity Measurement

Machine Learning

Health Disparity Estimates

Survey Research

National Survey of Family Growth (NSFG)

Bootstrap Resampling 

Abstract 2296

Co-Author

Brady West, Institute for Social Research

First Author

Rona Hu

Presenting Author

Rona Hu

Interviewer Morale, Field Effort and Field Efficiency in the National Health Interview Survey

To examine the association of interviewer morale with field effort and efficiency, the National Health Interview Survey (NHIS) conducted an evaluation of an NHIS interviewer support initiative fielded from September to December 2023. NHIS, the nation's gold-standard nationally representative household health survey, is conducted by the National Center for Health Statistics, with data collected by U.S. Census Bureau Field Representatives (FRs). After describing the initiative, which facilitated peer and supervisor encouragement and instrumental support for NHIS FRs in completing their NHIS cases, this paper presents the methods and results of the evaluation. The evaluation consisted of a 2024 NHIS FR survey on FR perspectives on the initiative's benefits, and an analysis of 2022-2024 NHIS paradata examining the difference in differences across years in the mean number of days to first contact and the mean number of in-person, phone, and total contact attempts to first contact, per case and per completed case. Differences in these measures between August 2023 and January 2024 (before and after the initiative) are compared with differences in the same measures between August 2022 and January 2023.
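For concreteness, the sketch below computes the difference-in-differences contrast described above on made-up paradata values; the numbers are placeholders, not NHIS results.

```python
# Minimal sketch of the difference-in-differences contrast described above;
# the paradata values are made-up placeholders, not NHIS results.
import pandas as pd

paradata = pd.DataFrame({
    "period": ["Aug 2022", "Jan 2023", "Aug 2023", "Jan 2024"],
    "mean_days_to_first_contact": [4.1, 4.6, 4.3, 4.0],   # hypothetical means
})
m = paradata.set_index("period")["mean_days_to_first_contact"]

# Change across the initiative (Aug 2023 -> Jan 2024) minus the change across
# the same months one year earlier (Aug 2022 -> Jan 2023).
did = (m["Jan 2024"] - m["Aug 2023"]) - (m["Jan 2023"] - m["Aug 2022"])
print(f"Difference-in-differences in mean days to first contact: {did:.2f}")
```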

Keywords

Interviewer support

Interviewer morale

CAPI survey

Field effort

Field efficiency 

Abstract 2277

Co-Author(s)

Galila Haile, National Center for Health Statistics
Beth Taylor, NCHS/CDC
Grace Medley, NCHS/CDC
Maria Villarroel, NCHS/CDC
Antonia Warren, NCHS/CDC
Jonaki Bose, NCHS
Lindsay Howden, U.S. Census Bureau
Aaron Maitland, National Center for Health Statistics
Lillian Hoffmann, U.S. Census Bureau
James Dahlhamer, National Center for Health Statistics

First Author

Adena Galinsky

Presenting Author

Adena Galinsky

Leveraging Wearables Data to Improve Self-Reports in Survey Research: An Imputation-Based Approach

The integration of wearable sensor data in survey research has the potential to mitigate the recall and response errors that are typical of self-report data. However, such studies are often constrained in scale by implementation challenges and associated costs. This study used NHANES data, which include both self-report responses and wearable sensor data measuring physical activity, to multiply impute sensor values for NHIS, a larger survey that relies solely on interviews. Imputations were performed on synthetic populations to fully account for the complex sample design features.

Cross-validation demonstrated the robust predictive performance of the imputation model. The results showed disparities between sensor estimates and survey self-reports, and these discrepancies varied across subgroups. Imputed estimates in NHIS closely mirrored the observed values in NHANES but tended to have higher standard errors. After the imputation, self-reports and sensor data in the combined dataset were used to predict health conditions as a means of evaluating data quality. Models that included sensor values showed smaller deviance and higher coefficients of determination. The study advanced the existing literature on combining multiple data sources and provided insights into the use of sensor data in survey research.
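A minimal sketch of the general imputation idea, assuming simulated data and a generic multiple-imputation routine (scikit-learn's IterativeImputer with posterior draws) rather than the study's model or the synthetic-population step.

```python
# Hedged sketch only: simulated data, assumed variable names, and scikit-learn's
# IterativeImputer as a generic multiple-imputation stand-in (not the study's model).
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(2)
n_nhanes, n_nhis = 300, 500
n = n_nhanes + n_nhis

# "NHANES-like" rows carry both self-reported and sensor-measured activity;
# "NHIS-like" rows have the sensor value missing by design.
self_report = rng.normal(30, 10, n)                   # self-reported minutes/day
sensor = np.where(np.arange(n) < n_nhanes,
                  0.7 * self_report + rng.normal(0, 5, n),
                  np.nan)
df = pd.DataFrame({"self_report": self_report,
                   "sensor": sensor,
                   "age": rng.integers(18, 80, n)})

# Multiple imputation: repeated posterior draws yield several completed datasets.
imputed_means = []
for draw in range(5):
    imputer = IterativeImputer(sample_posterior=True, random_state=draw)
    completed = pd.DataFrame(imputer.fit_transform(df), columns=df.columns)
    imputed_means.append(completed["sensor"].iloc[n_nhanes:].mean())
print("Mean imputed sensor activity across 5 draws:", np.round(imputed_means, 1))
```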

Keywords

missing data imputation

wearable sensor data

self-report survey

data integration

NHANES

NHIS 

Abstract 1960

Co-Author

Brady West, Institute for Social Research

First Author

Deji Suolang, University of Michigan - Ann Arbor

Presenting Author

Deji Suolang, University of Michigan - Ann Arbor

Testing Whether Text and Email Contacts Improve Response in a Large ABS Mixed-Mode Study

Using multiple modes of contact can increase participation relative to a single mode. Text messaging has emerged as a new contact mode; however, it is unclear how best to combine texting with mail and email contacts and what effects these strategies have on response and data quality. To explore the impact of text and email contacts, we designed experiments that varied the number and sequencing of text and email contacts. These were implemented in two waves of the National Survey of Fishing, Hunting, and Wildlife-Associated Recreation, a nationally representative, longitudinal study.
We experimented with the use of text and email invitations and reminders, and with the number of reminders sent by each mode. The first study compared text reminders sent early versus later in the field period and the impact of a text invitation. The second study explored the use of text and email invitations and the use of multiple text reminders. We also explored the impact of email invitations depending on whether the email address had been provided only for study contact or for a prior survey incentive payment. In the paper, we examine the effects of the experiments on completion rates, response time, sample representation, and item nonresponse.
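As an illustration of how the outcome measures named above might be summarized by experimental condition, the sketch below uses hypothetical case-level data; the condition labels and variables are assumptions, not the study's dataset.

```python
# Illustrative sketch: hypothetical case-level records for summarizing the
# outcomes named above; condition labels and variables are assumptions.
import pandas as pd

cases = pd.DataFrame({
    "condition": ["early_text", "early_text", "late_text", "late_text",
                  "text_invite", "text_invite", "email_invite", "email_invite"],
    "completed": [1, 0, 1, 1, 0, 1, 1, 0],
    "days_to_complete": [3, None, 6, 5, None, 4, 7, None],
    "items_missing": [0, None, 2, 1, None, 0, 3, None],
})

summary = cases.groupby("condition").agg(
    completion_rate=("completed", "mean"),
    median_days_to_complete=("days_to_complete", "median"),
    mean_items_missing=("items_missing", "mean"),
)
print(summary)
```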

Keywords

mixed-mode

contact strategies

text messaging

text reminders

response rates

text invitations 

Abstract 3534

Co-Author(s)

Leah Christian, NORC
Zoe Slowinski, NORC
Christopher Hansen, NORC

First Author

Martha McRoy, NORC at the University of Chicago

Presenting Author

Martha McRoy, NORC at the University of Chicago