Variable Selection in Capture-Recapture Models for Adjustment Weight Estimation
Valbona Bejleri
Co-Author
United States Department of Agriculture – National Agricultural Statistics Service
Luca Sartore
Co-Author
National Institute of Statistical Sciences
Wednesday, Aug 5: 10:30 AM - 12:20 PM
Topic-Contributed Paper Session
USDA's National Agricultural Statistics Service (NASS) conducts the Census of Agriculture every five years. Because the Census Mailing List (CML) is incomplete, NASS uses the June Area Survey (JAS) to assess undercoverage. A capture–recapture framework allows for the estimation of weights that adjust for undercoverage, nonresponse, and misclassification. The CML and JAS records are first linked, and sigmoidal models are then fitted to all records. Standard penalized logistic regression may fail to identify the most important covariates, which increases the bias and uncertainty of model-based estimates. We introduce a novel penalty structure that enables joint variable selection across multiple models and yields improved adjustment weights and unbiased Census totals. Our approach combines advanced penalties with fractional gradient descent to handle high-dimensional settings in which predictors and their interactions exceed ten million elements. Applied to 2022 Census data, it isolates critical predictors, reduces bias, and preserves parsimony, offering a scalable solution for accurate and efficient agricultural statistics.
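To make the idea of joint variable selection across several models concrete, the following is a minimal sketch, not the authors' method. It assumes a textbook row-wise group-lasso penalty and plain proximal gradient descent, standing in for the novel penalty structure and the fractional gradient descent described in the abstract. The function names, the coefficient layout (one row per covariate, one column per model), and the parameters `step` and `lam` are all hypothetical.

```python
import numpy as np


def sigmoid(z):
    """Logistic (sigmoidal) link."""
    return 1.0 / (1.0 + np.exp(-z))


def group_soft_threshold(B, t):
    """Proximal operator of t * sum_j ||B[j, :]||_2.

    B has one row per covariate and one column per model (assumed layout).
    A row whose Euclidean norm is at most t is set to zero in every column,
    so the covariate is dropped from all models at once.
    """
    norms = np.linalg.norm(B, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - t / np.maximum(norms, 1e-12))
    return scale * B


def proximal_step(B, X, Y, step, lam):
    """One proximal gradient step for K logistic models sharing covariates.

    X: (n, p) design matrix, Y: (n, K) binary outcomes, B: (p, K) coefficients.
    This is standard proximal gradient descent, an assumption made for
    illustration only.
    """
    grad = X.T @ (sigmoid(X @ B) - Y) / X.shape[0]
    return group_soft_threshold(B - step * grad, step * lam)
```

In this sketch the shared row norm is what couples the models: a covariate stays in the fit only if its coefficients, taken together across all models, are large enough to survive the threshold.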
Bias reduction
Census of Agriculture
Fractional gradient descent
High-dimensional inference
Model parsimony
Regularization methods