The Use of Big Data-Based Model Prediction for Stratification of Household Addresses
Abstract Number:
2919
Submission Type:
Contributed Abstract
Contributed Abstract Type:
Paper
Participants:
Noah Bassel (1)
Institutions:
(1) N/A, N/A
First Author:
Presenting Author:
Abstract Text:
The National Survey of Early Care and Education (NSECE) is the most comprehensive study of the availability and use of early care and education (ECE) in the U.S. Because the target population of the NSECE's household survey is a relatively small proportion of all households, the cost of screening households to determine eligibility has always been an important constraint for the NSECE. Like many household surveys, the NSECE also faces the twin challenges of declining response rates and rising data collection costs. In response, the 2024 NSECE incorporates big data classification and disproportionate stratification into its frame construction and sampling design. Commercial household data are used as inputs to a machine learning model that predicts the probability that a given household on the frame falls within the target population. Household addresses are then stratified by predicted probability, and households with a high probability of eligibility are oversampled. In this study, we evaluate the tradeoff between cost savings and survey precision and compare the eligibility rates realized during data collection with those predicted at the design stage.
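The cost and precision tradeoff described above can be illustrated with a minimal sketch. This is not the NSECE design: the two strata, their frame counts, eligibility rates, and sampling rates are invented for illustration, and the `Stratum` class and `evaluate_design` function are hypothetical names. The sketch assumes every sampled address is screened and every eligible household responds, and it measures the precision loss with Kish's approximate design effect from unequal weighting.

```python
from dataclasses import dataclass


@dataclass
class Stratum:
    name: str
    n_addresses: int         # addresses on the frame in this stratum (made up)
    eligibility_rate: float  # mean model-predicted probability of eligibility (made up)
    sampling_rate: float     # fraction of addresses selected for screening (made up)


def evaluate_design(strata):
    """Return (expected screened, expected eligible, Kish weighting effect)."""
    screened = sum(s.n_addresses * s.sampling_rate for s in strata)
    eligible = sum(
        s.n_addresses * s.sampling_rate * s.eligibility_rate for s in strata
    )
    # Each eligible household in a stratum carries base weight 1 / sampling_rate.
    # Sum of weights over the expected eligible households in each stratum:
    sum_w = sum(s.n_addresses * s.eligibility_rate for s in strata)
    # Sum of squared weights over the same households:
    sum_w2 = sum(
        s.n_addresses * s.eligibility_rate / s.sampling_rate for s in strata
    )
    # Kish's approximation: deff = n * sum(w^2) / (sum(w))^2
    deff = eligible * sum_w2 / sum_w ** 2
    return screened, eligible, deff


# Oversample the high-probability stratum at four times the rate of the other.
disproportionate = [
    Stratum("high predicted eligibility", 20_000, 0.40, 0.05),
    Stratum("low predicted eligibility", 80_000, 0.05, 0.0125),
]

# Same total screening effort, but one sampling rate for every address.
proportionate = [
    Stratum("high predicted eligibility", 20_000, 0.40, 0.02),
    Stratum("low predicted eligibility", 80_000, 0.05, 0.02),
]

for label, design in [("disproportionate", disproportionate),
                      ("proportionate", proportionate)]:
    screened, eligible, deff = evaluate_design(design)
    print(f"{label}: screened={screened:.0f}, eligible={eligible:.0f}, "
          f"deff={deff:.2f}, effective n={eligible / deff:.0f}")
```

With these invented numbers, both designs screen about 2,000 addresses. The disproportionate design yields about 450 eligible households with a weighting effect of about 1.5 (an effective sample of about 300), while the proportionate design yields about 240 with no weighting penalty. If realized eligibility rates differ from the predicted ones, the gain shrinks or grows accordingly, which is why comparing the two matters.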
Keywords:
Big Data|Machine Learning|Stratification|Sample Design
Sponsors:
Survey Research Methods Section
Tracks:
Sample Design