Quantifying uncertainty in marathon finish time predictions

Presented During: Sports Analytics and the Boston Marathon

Eric Gerber Speaker
Northeastern University

Brandon Onyejekwe Co-Author
Northeastern University

Tuesday, Aug 4: 10:35 AM - 11:00 AM
Invited Paper Session

Thomas M. Menino Convention & Exhibition Center

During a marathon, a runner's expected finish time is commonly estimated by extrapolating their average pace so far, assuming it will hold constant for the rest of the race. This approach has two problems: it ignores in-race context that can indicate whether a runner is likely to finish faster or slower than expected, and it yields a single point estimate with no information about uncertainty. To address these issues, we implement a hierarchical Bayesian linear regression model that incorporates information from all splits in a race and quantifies the uncertainty around predicted finish times. We compare multiple models under this Bayesian framework with the traditional extrapolation method using data from the Boston, New York, and Chicago Marathons over four years (2021-2024) and find a marked improvement in predictive accuracy. We also develop an app that allows runners to visualize their estimated finish time distribution in real time.
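
For readers unfamiliar with the traditional extrapolation baseline, the following is a minimal Python sketch of the constant-pace calculation described in the abstract. It is an illustration only, not the authors' code: the function names, the use of kilometers and seconds, and the example split are all assumptions made here. It does not attempt to reproduce the hierarchical Bayesian model, whose details the abstract does not give.

```python
# Constant-pace extrapolation: a hypothetical sketch of the baseline method.
# Names and units are illustrative assumptions, not taken from the talk.

MARATHON_KM = 42.195  # standard marathon distance in kilometers


def extrapolate_finish_time(elapsed_seconds: float, distance_km: float) -> float:
    """Predict the finish time in seconds, assuming the average pace so far holds."""
    if not 0 < distance_km <= MARATHON_KM:
        raise ValueError("distance_km must be in (0, 42.195]")
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    average_pace = elapsed_seconds / distance_km  # seconds per kilometer
    return average_pace * MARATHON_KM


def format_hms(seconds: float) -> str:
    """Format a duration in seconds as H:MM:SS, rounded to the nearest second."""
    total = round(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


if __name__ == "__main__":
    # Made-up example: a runner reaches the halfway mark (21.0975 km) in 1:45:00.
    predicted = extrapolate_finish_time(elapsed_seconds=6300, distance_km=21.0975)
    print(format_hms(predicted))
```

The result is a single number, which makes both limitations named in the abstract visible: nothing in the calculation reflects how the runner's pace has changed across earlier splits, and there is no interval or distribution around the prediction.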

Keywords

Marathon

Bayesian linear regression

Uncertainty quantification

Sports analytics