Variable Selection in Capture-Recapture Models for Adjustment Weight Estimation

Habtamu Benecha Speaker
 
Habtamu Benecha Co-Author
 
Justin van Wart Co-Author
USDA NASS
 
Valbona Bejleri Co-Author
United States Department of Agriculture – National Agricultural Statistics Service
 
Luca Sartore Co-Author
National Institute of Statistical Sciences
 
Wednesday, Aug 5: 10:30 AM - 12:20 PM
Topic-Contributed Paper Session 
USDA's National Agricultural Statistics Service (NASS) conducts the Census of Agriculture every five years. Because the Census Mailing List (CML) is incomplete, NASS uses the June Area Survey (JAS) to assess undercoverage. A capture–recapture framework allows for the estimation of weights that adjust for undercoverage, nonresponse, and misclassification. The CML and JAS records are first linked, and sigmoidal models are then fitted to all records. Standard penalized logistic regression may fail to identify the most important covariates, which increases the bias and uncertainty of model-based estimates. We introduce a novel penalty structure that enables joint variable selection across multiple models and yields improved adjustment weights and unbiased Census totals. Our approach combines advanced penalties with fractional gradient descent to handle high-dimensional settings in which predictors and interactions exceed ten million elements. Applied to 2022 Census data, it isolates critical predictors, reduces bias, and preserves parsimony, offering a scalable solution for accurate and efficient agricultural statistics.
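
To make the idea of joint variable selection across several logistic models concrete, the following is a minimal sketch on synthetic data. It is not the penalty structure or the fractional gradient descent described in the abstract, whose details are not given here. As a stand-in, it uses a standard group-lasso penalty that groups each covariate's coefficients across all models, fitted by ordinary proximal gradient descent. The function names, the two synthetic outcomes, the penalty level `lam`, and the omission of an intercept are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_joint_logistic(X, Y, lam=0.1, n_iter=1000):
    """Fit K logistic models that share one design matrix X (n x p).

    Y is an n x K matrix of binary outcomes. The penalty is
    lam * sum_j ||B[j, :]||_2 (group lasso over rows of B), so a covariate
    is either kept in all K models or dropped from all of them.
    Hypothetical helper; no intercept is fitted.
    """
    n, p = X.shape
    K = Y.shape[1]
    B = np.zeros((p, K))
    # The mean logistic loss has gradient Lipschitz constant ||X||_2^2 / (4n).
    step = 4.0 * n / (np.linalg.norm(X, 2) ** 2)
    for _ in range(n_iter):
        # Gradient step on the smooth part (mean logistic loss, all models).
        grad = X.T @ (sigmoid(X @ B) - Y) / n
        B = B - step * grad
        # Proximal step: row-wise soft-thresholding zeroes whole covariates.
        norms = np.linalg.norm(B, axis=1, keepdims=True)
        shrink = np.maximum(0.0, 1.0 - step * lam / np.maximum(norms, 1e-12))
        B = B * shrink
    return B

# Synthetic example: 8 covariates, only the first two matter, two outcomes
# (standing in for, e.g., list coverage and response; purely illustrative).
rng = np.random.default_rng(0)
n, p, K = 2000, 8, 2
X = rng.standard_normal((n, p))
B_true = np.zeros((p, K))
B_true[0] = [2.0, -1.5]
B_true[1] = [-2.0, 1.0]
Y = (rng.random((n, K)) < sigmoid(X @ B_true)).astype(float)

B_hat = fit_joint_logistic(X, Y)
# Covariates whose coefficient row is nonzero are selected jointly.
selected = np.flatnonzero(np.linalg.norm(B_hat, axis=1) > 0)
print("selected covariates:", selected.tolist())
```

Grouping coefficients by covariate rather than penalizing each model separately is what ties the selection together across models: a row of `B_hat` is shrunk to zero as a unit. The penalty also shrinks the retained coefficients toward zero, so a refit on the selected covariates is a common follow-up step before computing any weights.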

Keywords

Bias reduction

Census of Agriculture

Fractional gradient descent

High-dimensional inference

Model parsimony

Regularization methods