Survey data integration with applications to hypertension among US children and adolescents

Emily Berg Co-Author
Iowa State University
 
Zhengyuan Zhu Co-Author
Iowa State University
 
Chengpeng Zeng First Author
 
Chengpeng Zeng Presenting Author
 
Tuesday, Aug 6: 11:35 AM - 11:50 AM
3751 
Contributed Papers 
Oregon Convention Center 
Probability sampling has served as the major approach to finite population inference for decades. In the era of big data, nonprobability samples have become popular for their feasibility and cost-effectiveness. However, without a known inclusion mechanism, nonprobability samples fail to represent the target population unless appropriate adjustments are made. To leverage the strengths of both sources, we develop a method for integrating probability and nonprobability samples when the variable of interest is observed in both samples. The proposed optimal estimator is more efficient than the estimators based on either sample alone. The method also accommodates informative selection into the nonprobability sample and ignorable nonresponse within the probability sample. We apply the method to blood pressure data on US children and adolescents from the National Health and Nutrition Examination Survey (NHANES) and from well-child visits throughout the Geisinger Health System. A replication method is used for variance estimation to account for the complex probability survey design of NHANES.
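
The abstract does not give the form of the optimal estimator. As a rough illustration of why combining two sources can be more efficient than using either one, the sketch below uses the textbook inverse-variance weighting of two independent, unbiased estimates. This is an assumed stand-in, not the authors' estimator; the function name `combine_estimates` and the example numbers are hypothetical and are not taken from NHANES or Geisinger data.

```python
def combine_estimates(est_a, var_a, est_b, var_b):
    """Inverse-variance weighted combination of two estimates.

    Assumes (for illustration only) that both estimates are unbiased
    for the same population quantity and are independent.
    Returns the combined estimate and its variance.
    """
    if var_a <= 0 or var_b <= 0:
        raise ValueError("variances must be positive")
    # The weight on each estimate is proportional to the other's variance,
    # so the more precise estimate receives the larger weight.
    w_a = var_b / (var_a + var_b)
    combined = w_a * est_a + (1.0 - w_a) * est_b
    # Under independence, the combined variance is below both inputs.
    combined_var = (var_a * var_b) / (var_a + var_b)
    return combined, combined_var


# Hypothetical inputs: a noisier estimate and a more precise one.
estimate, variance = combine_estimates(10.0, 4.0, 12.0, 1.0)
```

Under these assumptions the combined variance is never larger than the smaller of the two input variances, which is the sense in which a combined estimator can gain efficiency over either sample. In practice the variance inputs for a complex survey such as NHANES would come from a design-based procedure such as the replication method mentioned above.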

Keywords

Nonprobability sample

Probability sample

Informative sampling

Missing at random

Variance estimation

NHANES 

Main Sponsor

Survey Research Methods Section