09. Correcting Error-Prone Food Access Measures: A Comparative Study of Strategic Validation Sampling Designs

Conference: Women in Statistics and Data Science 2025
11/12/2025: 3:00 PM - 4:00 PM EST
Speed 

Description

Understanding the relationship between neighborhood food environments and obesity outcomes relies heavily on accurately measuring food access. However, proximity-based metrics frequently introduce measurement error and misclassification, leading to bias in regression estimates. This study evaluates five two-phase validation sampling designs to correct bias due to error-prone food access measures when modeling obesity prevalence using Poisson regression.

Using census tract-level data from the Piedmont Triad region of North Carolina, we defined true food access based on driving distances and compared it to an error-prone proxy based on straight-line distances, which underestimate the distance actually traveled. We implemented five validation sampling strategies: simple random sampling, case-control sampling, balanced case-control sampling, extreme tail sampling, and residual sampling based on the "naive" model (fit using the error-prone data). For each design, a validation subset of 48 tracts was selected to receive true food access measures; all other tracts had only the error-prone versions. We then analyzed tract-level obesity prevalence using the partially validated data and maximum likelihood estimation (MLE), jointly modeling the outcome and error processes. Our findings provide practical guidance for designing efficient validation studies in food environment research and other public health applications, demonstrating that strategically selected validation subsets and MLE correction can substantially enhance inference even under limited validation resources.
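To make the design step concrete, the following is a minimal sketch of how a 48-tract validation subset might be chosen under three of the designs named above. It is not the authors' implementation. The simulated data, the number of tracts (200), the error model, the use of Pearson residuals, and all function names are assumptions made for illustration only. The two case-control designs are omitted because the abstract does not say how cases are defined at the tract level.

```python
import numpy as np

rng = np.random.default_rng(2025)

# Hypothetical simulated tracts (NOT the study data; all values are assumptions).
n_tracts, n_val = 200, 48                                  # 48 validated tracts, as in the abstract
pop = rng.integers(1000, 6000, size=n_tracts)              # tract populations
x_true = rng.gamma(shape=2.0, scale=1.5, size=n_tracts)    # stand-in for driving distance
x_star = x_true * rng.uniform(0.6, 1.0, size=n_tracts)     # underestimated straight-line proxy
y = rng.poisson(pop * np.exp(-1.5 + 0.05 * x_true))        # obesity counts per tract


def fit_poisson(X, y, offset, n_iter=25):
    """Newton-Raphson fit of a log-link Poisson model with an offset.

    Assumes the first column of X is an intercept.
    """
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.sum() / np.exp(offset).sum())       # start at the overall rate
    for _ in range(n_iter):
        mu = np.exp(X @ beta + offset)
        grad = X.T @ (y - mu)                              # score vector
        hess = X.T @ (X * mu[:, None])                     # Fisher information
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def simple_random_sample(n, n_val, rng):
    """Choose n_val tract indices uniformly at random, without replacement."""
    return np.sort(rng.choice(n, size=n_val, replace=False))


def extreme_tail_sample(scores, n_val):
    """Choose the tracts with the smallest and largest scores (half from each tail)."""
    order = np.argsort(scores)
    half = n_val // 2
    upper_start = len(order) - (n_val - half)
    return np.sort(np.concatenate([order[:half], order[upper_start:]]))


# "Naive" model: regress counts on the error-prone proxy for every tract.
X_naive = np.column_stack([np.ones(n_tracts), x_star])
offset = np.log(pop)
beta_naive = fit_poisson(X_naive, y, offset)
mu = np.exp(X_naive @ beta_naive + offset)
resid = (y - mu) / np.sqrt(mu)                             # Pearson residuals (an assumption)

designs = {
    "srs": simple_random_sample(n_tracts, n_val, rng),
    "extreme_tail": extreme_tail_sample(x_star, n_val),    # tails of the error-prone proxy
    "residual": extreme_tail_sample(resid, n_val),         # tails of naive-model residuals
}

# Partially validated data for one design: true access is known only where validated.
validated = np.zeros(n_tracts, dtype=bool)
validated[designs["residual"]] = True
x_obs = np.where(validated, x_true, np.nan)

for name, idx in designs.items():
    print(f"{name}: {len(idx)} tracts validated")
```

Each design returns tract indices, so the same downstream step can consume any of them: validated tracts contribute both the true and error-prone measures, and unvalidated tracts contribute only the proxy. The joint MLE of the outcome and error models described in the abstract is not shown here.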

Keywords

Poisson regression

Measurement Error

Food Environment

Obesity Prevalence

Social determinants of health 

Presenting Author

Yizhi Zhang

First Author

Yizhi Zhang

CoAuthor

Sarah Lotspeich, Wake Forest University

Target Audience

Beginner

Tracks

Knowledge
Women in Statistics and Data Science 2025