Quantifying uncertainty in marathon finish time predictions

Eric Gerber, Speaker
Northeastern University

Brandon Onyejekwe, Co-Author
Northeastern University

Tuesday, Aug 4: 10:35 AM - 11:00 AM
Invited Paper Session 
Thomas M. Menino Convention & Exhibition Center 
During a marathon, a runner's expected finish time is commonly estimated by extrapolating their average pace so far, assuming that pace will hold constant for the rest of the race. Predicting finish times this way has two problems: the estimate ignores in-race context that can indicate whether a runner is likely to finish faster or slower than the extrapolation suggests, and the prediction is a single point estimate that carries no information about uncertainty. To address these issues, we implement a hierarchical Bayesian linear regression model that incorporates information from all splits in a race and quantifies the uncertainty around the predicted finish times. We compare multiple models under this Bayesian framework with the traditional extrapolation method using data from the Boston, New York, and Chicago Marathons over four years (2021-2024) and find a marked improvement in predictive accuracy. We also develop an app that lets runners visualize their estimated finish time distribution in real time.
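
For readers unfamiliar with the baseline, the constant-pace extrapolation described in the abstract can be sketched roughly as follows. This is only an illustration of the traditional method, not the authors' code or their Bayesian model. The function names and the example split are hypothetical, and the only fixed quantity is the standard marathon distance of 42.195 km.

```python
MARATHON_KM = 42.195  # standard marathon distance in kilometres


def extrapolate_finish_seconds(elapsed_seconds: float, distance_km: float) -> float:
    """Project a finish time by assuming the average pace so far holds.

    Hypothetical helper for illustration: returns a single point
    estimate with no measure of uncertainty, which is the limitation
    the abstract points out.
    """
    if not 0 < distance_km <= MARATHON_KM:
        raise ValueError("distance_km must be in (0, 42.195]")
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    average_pace = elapsed_seconds / distance_km  # seconds per kilometre
    return average_pace * MARATHON_KM


def format_hms(total_seconds: float) -> str:
    """Format a duration in seconds as H:MM:SS, rounded to the nearest second."""
    total = round(total_seconds)
    hours, remainder = divmod(total, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


if __name__ == "__main__":
    # Made-up example: a runner reaches halfway (21.0975 km) in 1:45:00.
    projected = extrapolate_finish_seconds(6300, 21.0975)
    print(format_hms(projected))
```

The projection depends only on the elapsed time and distance at the latest split, so two runners with the same average pace receive the same estimate regardless of how their earlier splits evolved.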

Keywords

Marathon
Bayesian linear regression
Uncertainty quantification
Sports analytics