Tuesday, Aug 6: 11:35 AM - 11:50 AM
3751
Contributed Papers
Oregon Convention Center
Probability sampling has been the dominant approach to finite population inference for decades. In the era of big data, nonprobability samples have become popular for their feasibility and cost-effectiveness. However, without a known inclusion mechanism, nonprobability samples fail to represent the target population unless appropriate adjustments are made. To leverage the strengths of both sources, we develop a method for integrating probability and nonprobability samples when the variable of interest is observed in both. The proposed optimal estimator is more efficient than the estimators based on either sample alone. The method also accommodates informative selection into the nonprobability sample and ignorable nonresponse within the probability sample. We apply the method to blood pressure data for US children and adolescents from the National Health and Nutrition Examination Survey (NHANES) and from well-child visits throughout the Geisinger Health System. A replication method is used for variance estimation to account for the complex probability survey design of NHANES.
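The abstract does not give the form of the proposed estimator. As a generic illustration only, and not the authors' method, the sketch below shows why combining two estimates can be more efficient than using either one alone. It uses textbook inverse-variance weighting and assumes that both estimates are unbiased for the same population mean (so the nonprobability estimate has already been adjusted for selection) and that they are independent. The function name `combine_inverse_variance` and all inputs are hypothetical.

```python
def combine_inverse_variance(est_a, var_a, est_b, var_b):
    """Combine two independent, unbiased estimates of the same quantity.

    Generic inverse-variance weighting, shown for illustration; it is an
    assumption here, not the estimator proposed in the abstract.
    Returns (combined estimate, combined variance, weight on est_a).
    """
    if var_a <= 0 or var_b <= 0:
        raise ValueError("variances must be positive")
    precision_a = 1.0 / var_a
    precision_b = 1.0 / var_b
    total_precision = precision_a + precision_b
    # Weight each estimate by its share of the total precision.
    weight_a = precision_a / total_precision
    combined = weight_a * est_a + (1.0 - weight_a) * est_b
    # Variance of the combination; never larger than min(var_a, var_b).
    combined_var = 1.0 / total_precision
    return combined, combined_var, weight_a


# Made-up numbers: a probability-sample estimate with variance 4.0 and an
# adjusted nonprobability-sample estimate with variance 1.0.
combined, combined_var, weight_a = combine_inverse_variance(100.0, 4.0, 104.0, 1.0)
print(combined, combined_var, weight_a)
```

With these made-up inputs the combined estimate is about 103.2 with variance 0.8, below the smaller input variance of 1.0. In practice the two variances would themselves be estimated, for example by replication for the survey component.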
Nonprobability sample
Probability sample
Informative sampling
Missing at random
Variance estimation
NHANES
Main Sponsor
Survey Research Methods Section