Revolutionizing Propensity Score Estimation with Machine Learning Techniques

David Haziza Chair
University of Ottawa
 
Cindy Yu Discussant
Iowa State University
 
Sixia Chen Organizer
 
Monday, Aug 4: 2:00 PM - 3:50 PM
0171 
Invited Paper Session 
Music City Center 
Room: CC-104E 

Keywords

Machine Learning

Propensity Score

Survey Data 

Applied

Yes

Main Sponsor

Social Statistics Section

Co-Sponsors

Section on Statistical Learning and Data Science
Survey Research Methods Section

Presentations

On the deep neural network-based nonresponse adjustment for complex survey data

Unit nonresponse is a frequent issue in sample surveys, and naive estimates that do not account for nonrespondents can result in biased outcomes. Common nonresponse adjustment techniques, such as logistic regression and tree-based methods, rely on specific model assumptions that may not hold true, particularly when dealing with highly non-linear and high-dimensional nonresponse mechanisms. In contrast, deep neural network methods have demonstrated effectiveness in managing such complexities. In this paper, we propose the application of deep neural networks for nonresponse adjustment in complex survey data. We compare our approach with established methods, including logistic regression, generalized additive models, and tree-based techniques, through both simulation studies and real-world applications. Our results highlight the advantages of deep neural networks in improving the accuracy of nonresponse adjustments. 

Keywords

Machine learning

Deep learning

Nonresponse

Survey Data 

Co-Author

Sixia Chen

Speaker

Chao Xu, University of Oklahoma Health Sciences Center

Model-Based Weighting for Nonresponse in the American Community Survey: Evaluation and Visualization

Declining response rates and data collection interruptions are resulting in missing data complexity that traditional missing data techniques used in Census Bureau survey processing may not flexibly capture. At the same time, the availability and linkability of administrative records and third-party data have improved, allowing for more informative response propensity models. We present a study of inverse probability weighting (IPW) to adjust for unit nonresponse using traditional statistical models (non-ML) and machine learning (ML) algorithms adapted for complex survey data. We share various measures for model comparisons and for visualizing geographically differentiated results. This work presents a case study of the value and advantage of ML and non-ML model-based IPW nonresponse adjustment using auxiliary sources with multiple years of American Community Survey data.

 

Keywords

missing data

nonresponse

survey data

boosting

mapping visualizations 

Speaker

Darcy Steeg Morris, U.S. Census Bureau

On the use of machine learning methods for the treatment of unit nonresponse in surveys

In recent years, there has been significant interest in machine learning methods at national statistical offices. Thanks to their flexibility, these methods may prove useful at the nonresponse treatment stage.
After an introduction to statistical learning procedures, we will discuss some of the advantages and challenges associated with their use for the treatment of unit nonresponse. We will discuss the relationship between precise predictions and precise estimation. The results of an extensive simulation study will be presented to illustrate these points. Finally, the problem of selecting or aggregating several statistical learning procedures will be discussed.  

Keywords

Survey sampling

Nonresponse

Propensity score estimation

Aggregation 

Speaker

Mehdi Dagdoug, McGill University