Rule-Based Data Validation and Reconciliation of Survey Responses

Albert Lee Co-Author
Summit Consulting, LLC
 
Gunnar Ingle Speaker
Summit Consulting, LLC
 
Wednesday, Aug 7: 9:35 AM - 9:55 AM
Topic-Contributed Paper Session 
Oregon Convention Center 
Each year, the U.S. Department of Agriculture's National Agricultural Statistics Service (NASS) conducts more than a hundred surveys to understand and enumerate agriculture in the United States. The quality of survey responses varies by survey and by respondent. Ensuring that survey responses are valid, reliable, and internally consistent is vital to publishing accurate official statistics. NASS is undertaking modernization efforts to detect errors in survey responses and edit them through rule-based validation. These innovations include (1) reviewing and reconciling documented (e.g., written in business rules) and undocumented (e.g., appearing only in programming code) validation specifications, (2) identifying validation rules whose errors might be correctable with programming code or numeric methods, (3) using numeric methods, such as the Fellegi-Holt algorithm, and R software packages to automate response-level validation checks and error corrections, and (4) flagging instances of automated correction or validation errors for NASS analysts. This paper describes the processes and procedures used for each step and highlights commonly encountered challenges and their solutions.
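As a rough illustration of response-level rule validation and flagging, the following minimal Python sketch checks one record against a few edit rules and proposes the smallest set of fields that touches every failed rule. This is not the NASS implementation: the rules, field names (total_acres, planted_acres, harvested_acres), and functions are hypothetical, and Python is used here only for illustration even though the abstract refers to R packages. The covering step reflects only the minimal-change idea behind Fellegi-Holt error localization; a full treatment would also need implied edits to guarantee that the chosen fields can actually be corrected.

```python
from itertools import combinations

# Hypothetical edit rules: (name, fields involved, predicate that must hold).
RULES = [
    ("acres_nonnegative", ("total_acres",),
     lambda r: r["total_acres"] >= 0),
    ("planted_le_total", ("planted_acres", "total_acres"),
     lambda r: r["planted_acres"] <= r["total_acres"]),
    ("harvested_le_planted", ("harvested_acres", "planted_acres"),
     lambda r: r["harvested_acres"] <= r["planted_acres"]),
]


def failed_rules(record):
    """Return (name, fields) for every rule the record violates."""
    return [(name, fields) for name, fields, check in RULES
            if not check(record)]


def smallest_cover(failures):
    """Brute-force the smallest set of fields that touches every failed rule.

    Assumption: all fields are weighted equally. This is a simplification
    of Fellegi-Holt error localization, which also uses implied edits.
    """
    fields = sorted({f for _, fs in failures for f in fs})
    for k in range(len(fields) + 1):
        for subset in combinations(fields, k):
            if all(set(subset) & set(fs) for _, fs in failures):
                return set(subset)
    return set()


def review(record):
    """Validate one record and flag candidate fields for an analyst."""
    failures = failed_rules(record)
    return {
        "valid": not failures,
        "failed": [name for name, _ in failures],
        "candidate_fields": smallest_cover(failures),
    }


if __name__ == "__main__":
    # Hypothetical response: planted acreage exceeds total acreage.
    response = {"total_acres": 100, "planted_acres": 150,
                "harvested_acres": 160}
    print(review(response))
```

In this hypothetical record two rules fail, and both involve planted_acres, so that single field is flagged as the candidate for correction or analyst review.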