Predicting the Final Points of the Decathlon Based on the Results of the First Day Events using Regr

Abdelmonaem Jornaz First Author
Park University
 
Abdelmonaem Jornaz Presenting Author
Park University
 
Sunday, Aug 4: 3:15 PM - 3:20 PM
3831 
Contributed Speed 
Oregon Convention Center 
The decathlon is a complex athletics discipline for male athletes that combines ten track and field events held over two days. These ten events can be classified as "running," "jumping," and "throwing" events. A dataset was compiled from the competition results of all Olympic Games and World Athletics Championships from 1984 to 2023 (n = 595) and divided into training (90%) and testing (10%) subsets.
The main objective of this study is to predict the final decathlon points from the results of the five first-day events. The training and test sets were resampled with replacement 10,000 times from the original dataset, four regression models were then compared to determine which fits the data best, and the root mean square error (RMSE) was used as the model performance criterion. The results showed that final performance is highly influenced by two first-day events: long jump (LJ) and shot put (SP). In addition, multiple linear regression was the best-performing model for predicting the final results, followed by partial least squares regression and quantile regression.
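The resampling and RMSE comparison described above can be illustrated with a minimal Python sketch. It is not the author's code: the data are synthetic, the coefficients and noise level are arbitrary, and the helper names (`rmse`, `fit_ols`, `predict_ols`, `bootstrap_test_rmse`) are hypothetical. It assumes one possible reading of the procedure, namely a single 90/10 split whose training and test rows are each resampled with replacement, and it fits only the multiple linear regression model, with 200 replicates instead of 10,000 to keep the run short.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def fit_ols(X, y):
    """Fit a multiple linear regression (with intercept) by least squares."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef  # coef[0] is the intercept


def predict_ols(coef, X):
    """Predict with coefficients returned by fit_ols."""
    return np.column_stack([np.ones(len(X)), X]) @ coef


def bootstrap_test_rmse(X, y, n_boot=200, test_frac=0.10, seed=0):
    """Assumed scheme: split once, then resample each subset with replacement."""
    rng = np.random.default_rng(seed)
    n = len(y)
    perm = rng.permutation(n)
    n_test = int(round(test_frac * n))
    test_rows, train_rows = perm[:n_test], perm[n_test:]
    scores = np.empty(n_boot)
    for b in range(n_boot):
        tr = rng.choice(train_rows, size=len(train_rows), replace=True)
        te = rng.choice(test_rows, size=len(test_rows), replace=True)
        coef = fit_ols(X[tr], y[tr])
        scores[b] = rmse(y[te], predict_ols(coef, X[te]))
    return scores


# Synthetic stand-in for the five first-day event results (NOT real data).
data_rng = np.random.default_rng(42)
n = 595
X = data_rng.normal(size=(n, 5))
assumed_coef = np.array([1.0, 2.0, 1.5, 0.5, 1.0])  # arbitrary illustration
y = 8000 + 100 * (X @ assumed_coef) + data_rng.normal(scale=150, size=n)

scores = bootstrap_test_rmse(X, y, n_boot=200)
print(f"Mean test RMSE over {len(scores)} replicates: {scores.mean():.1f}")
```

Under these assumptions, the other regression models would be compared by swapping a different fit and predict pair into the loop and comparing the resulting RMSE distributions.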

Keywords

Decathlon

Multiple linear regression model

Partial least squares regression

Quantile regression

Principal component regression model

Root mean square error (RMSE) 

Main Sponsor

Section on Statistics in Sports