Predicting Quality of a Survey Item from the Question Text
Lydia Repke
Co-Author
GESIS – Leibniz Institute for the Social Sciences
Tuesday, Aug 5: 3:20 PM - 3:35 PM
1280
Contributed Papers
Music City Center
The Survey Quality Predictor (SQP) predicts the quality of survey questions based on 72 question characteristics (e.g., domain, nouns word count, answer scale, length of question). These characteristics are manually coded. We evaluate whether the quality of a survey question can be predicted directly from its natural language text rather than from the 72 question characteristics. We found that a language model can predict survey item quality directly from the text of the question and its answer options, and does so as well as the random forest model based on the 72 manually coded characteristics.
Specifically, we fine-tuned XLM-RoBERTa, a multilingual transformer-based model trained on multiple text corpora in 100 languages, on our SQP dataset. The current web interface of the Survey Quality Predictor (https://sqp.gesis.org) requires users to code the 72 features themselves, following a coding manual, and enter them manually. Our work shows that the current implementation can be replaced with a much more user-friendly web interface: users simply enter the question text (and answer choices), and our language model predicts the question quality.
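The text-only workflow described above could be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the Hugging Face `transformers` and `torch` packages, the public `xlm-roberta-base` checkpoint (the abstract does not say which model size was used), and a hypothetical helper `build_input_text` that joins the question and its answer options into one string. The input format and the `max_length` value are likewise assumptions.

```python
from typing import List, Sequence

# Assumption: the base-size public checkpoint; the abstract does not name one.
MODEL_NAME = "xlm-roberta-base"


def build_input_text(question: str, answer_options: Sequence[str]) -> str:
    """Hypothetical helper: join question and answer options into one string."""
    options = " | ".join(opt.strip() for opt in answer_options if opt.strip())
    question = question.strip()
    return f"{question} Answer options: {options}" if options else question


def load_regressor(model_name: str = MODEL_NAME):
    """Load the tokenizer and the model with a single-output regression head."""
    # Imported lazily so the text helper above works without these packages.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    # num_labels=1 with problem_type="regression" makes the model use a
    # mean-squared-error loss when labels are supplied during fine-tuning.
    model = AutoModelForSequenceClassification.from_pretrained(
        model_name, num_labels=1, problem_type="regression"
    )
    return tokenizer, model


def predict_quality(texts: List[str], tokenizer, model) -> List[float]:
    """Return one predicted quality score per input text."""
    import torch

    model.eval()
    batch = tokenizer(
        texts, padding=True, truncation=True, max_length=256, return_tensors="pt"
    )
    with torch.no_grad():
        logits = model(**batch).logits  # shape: (batch_size, 1)
    return logits.squeeze(-1).tolist()


if __name__ == "__main__":
    text = build_input_text(
        "How satisfied are you with your life as a whole?",
        ["Very satisfied", "Fairly satisfied", "Not satisfied"],
    )
    print(text)
```

Until the regression head has been fine-tuned on question texts paired with SQP quality scores, the values returned by `predict_quality` are meaningless; the sketch only shows how a text-in, score-out interface might be wired together.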
Survey Quality
Language Model
Natural Language Processing
Transformer Model
Random Forest
Deep Learning
Main Sponsor
Section on Statistical Learning and Data Science