Small Area Model Validation using Data Thinning

Paul Parker Co-Author
University of California, Santa Cruz
 
Zehang Li Co-Author
University of California, Santa Cruz
 
Ameer Dharamshi Co-Author
 
Sho Kawano Speaker
University of California, Santa Cruz (UCSC)
 
Tuesday, Aug 5: 3:05 PM - 3:25 PM
Topic-Contributed Paper Session 
Music City Center 
Model validation and comparison are a challenge in Small Area Estimation. The primary gauge of a good small area model is the accuracy of its predictors. In many fields where predictive accuracy is the focus, a common practice is sample splitting: dividing the dataset into training and validation subsets. This is not possible in Small Area Estimation because replicate surveys do not exist. However, we show that data thinning, an approach for splitting an observation into two or more independent parts that sum to the original observation, allows us to validate small area models with relative ease, in much the same way as sample splitting. We will go over several example applications of validating area-level models.
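
As a rough illustration of the splitting idea described above, the following is a minimal Python sketch of data thinning for Gaussian observations. It assumes each direct estimate is normally distributed around the true area mean with a known sampling variance, which is a common setup for area-level models but is not stated in this abstract. The function name `thin_gaussian`, the split fraction `eps`, and the example numbers are all hypothetical and are not taken from the talk.

```python
import math
import random


def thin_gaussian(y, variances, eps=0.5, seed=None):
    """Thin Gaussian observations into two independent parts.

    Assumes y[i] ~ N(theta_i, variances[i]) with variances[i] known.
    Returns (train, test) with train[i] + test[i] == y[i], where
    train[i] ~ N(eps * theta_i, eps * variances[i]) and
    test[i]  ~ N((1 - eps) * theta_i, (1 - eps) * variances[i]),
    and the two parts are independent of each other.
    """
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    rng = random.Random(seed)
    train, test = [], []
    for y_i, d_i in zip(y, variances):
        # Independent noise whose variance cancels the covariance
        # between the two parts.
        noise = rng.gauss(0.0, math.sqrt(eps * (1.0 - eps) * d_i))
        part = eps * y_i + noise
        train.append(part)
        test.append(y_i - part)
    return train, test


# Hypothetical direct estimates and sampling variances for four areas.
direct = [12.1, 8.4, 15.3, 10.0]
samp_var = [1.5, 0.8, 2.2, 1.0]

train, test = thin_gaussian(direct, samp_var, eps=0.5, seed=1)

# Rescale each part so that it is unbiased for the area mean.
train_est = [t / 0.5 for t in train]
test_est = [t / 0.5 for t in test]
```

Under these assumptions, a candidate model could be fit to `train_est` (with sampling variances `samp_var[i] / eps`), and its predictions compared against `test_est`, which plays the role of a held-out validation set.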

Keywords

Cross validation

Small Area Estimation

Model Comparison

Data Thinning