62: Accounting for Systematic Biases in Transportation Data
Linda Boyle
Co-Author
University of Washington, Industrial & Systems Engineering
Grace Douglas
First Author
New York University C2SMARTER Institute
Grace Douglas
Presenting Author
New York University C2SMARTER Institute
Wednesday, Aug 6: 10:30 AM - 12:20 PM
1826
Contributed Posters
Music City Center
Transportation data on real-world events can be quite messy. Models trained on these data often exhibit misclassification patterns that affect the inferences drawn from them. This is a particular concern in safety research, where the models are used for crash prediction. This study presents a framework for identifying and analyzing systematic prediction error. Data on pedestrian-vehicle crashes at intersections in Seattle, Washington, are used to distinguish between locations prone to temporally systematic and spatially random prediction biases. The framework identified significant geographic heterogeneity in model performance, along with temporally consistent error patterns. A manual labeling protocol using Google Street View revealed environmental features (e.g., sight-line obstructions, infrastructure conditions) that were absent from the original training data. The analysis reduced manual review requirements by identifying the spatial and temporal components that contribute to the systematic biases observed in naturalistic data. The framework can be used in future crash prediction models to establish protocols for systematic pattern detection and new feature extraction.
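The idea of separating temporally consistent misclassification from sporadic error can be illustrated with a minimal sketch. It is not the authors' framework: the record layout, the function name `flag_systematic_locations`, the thresholds `min_periods` and `min_share`, and the example data are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def flag_systematic_locations(records, min_periods=3, min_share=0.75):
    """Label each location by how consistent its prediction errors are over time.

    records: iterable of (location_id, period, predicted, actual) with 0/1 labels.
    min_periods and min_share are illustrative thresholds, not values from the study.
    """
    signed = defaultdict(dict)
    for loc, period, predicted, actual in records:
        # +1 = false positive, -1 = false negative, 0 = correct
        signed[loc][period] = predicted - actual

    flags = {}
    for loc, by_period in signed.items():
        errors = list(by_period.values())
        n = len(errors)
        if n < min_periods:
            flags[loc] = "insufficient data"
            continue
        fp_share = sum(e == 1 for e in errors) / n
        fn_share = sum(e == -1 for e in errors) / n
        if max(fp_share, fn_share) >= min_share:
            # Same direction of error in most periods: candidate for manual review
            flags[loc] = "systematic"
        elif fp_share + fn_share > 0:
            flags[loc] = "sporadic"
        else:
            flags[loc] = "none"
    return flags


# Hypothetical intersections A, B, C observed over several years
example_records = [
    ("A", 2019, 0, 1), ("A", 2020, 0, 1), ("A", 2021, 0, 1), ("A", 2022, 0, 1),
    ("B", 2019, 1, 1), ("B", 2020, 0, 1), ("B", 2021, 1, 1), ("B", 2022, 0, 0),
    ("C", 2019, 0, 0), ("C", 2020, 1, 1), ("C", 2021, 0, 0),
]
flags = flag_systematic_locations(example_records)
```

Under these assumptions, only locations flagged as systematic would be queued for manual review (for instance, street-level imagery inspection), which is one way a screening step of this kind could reduce the labeling workload.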
crash modeling
misclassification
machine learning
Google Street View
validation
framework
Main Sponsor
Transportation Statistics Interest Group