007: Evaluation of Missing Data Imputation Techniques in Univariate Time Series Data Forecasting with ARIMA

Conference: Conference on Statistical Practice (CSP) 2023
02/03/2023: 7:30 AM - 8:45 AM PST
Posters 
Room: Cyril Magnin Foyer 

Description

In this work, we evaluated the predictive performance of an autoregressive integrated moving average (ARIMA) model on time-series data imputed under a missing completely at random (MCAR) mechanism using eight techniques: Kalman filtering with ARIMA, Kalman filtering with structural time series, exponentially weighted moving average, simple moving average, mean imputation, linear interpolation, Stine interpolation, and k-nearest neighbors (KNN) imputation. Missing values were generated artificially at rates of 10%, 15%, 25%, and 35% from complete 24-hour ambulatory blood pressure readings. The performance of the ARIMA models on the imputed and original data was compared using mean absolute percentage error (MAPE) and root mean square error (RMSE). At 10% missingness, mean imputation was the best technique, yielding the smallest MAPE and RMSE. At 15% missingness, the exponentially weighted moving average outperformed the other techniques in terms of RMSE, while Stine interpolation was the best method based on MAPE. At 25% missingness, Kalman filtering with structural time series performed better than the other techniques on both RMSE and MAPE. At 35% missingness, Kalman filtering with structural time series was the best in terms of RMSE, and Kalman filtering with ARIMA was the best based on MAPE.
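The evaluation loop described above can be illustrated with a minimal Python sketch. It is not the authors' code and it makes several assumptions: the series is synthetic rather than blood pressure data, only two of the eight techniques (mean imputation and linear interpolation) are shown, and the imputed series is scored directly against the original values instead of through ARIMA forecasts as in the study. All function names here (`make_mcar`, `mean_impute`, `linear_interpolate`, `rmse`, `mape`) are hypothetical.

```python
import math
import random


def make_mcar(series, rate, seed=0):
    """Copy `series`, setting round(rate * n) values to None.

    Positions are drawn uniformly at random, independent of the values,
    which is what the MCAR mechanism requires.
    """
    rng = random.Random(seed)
    n_missing = round(rate * len(series))
    missing_idx = set(rng.sample(range(len(series)), n_missing))
    return [None if i in missing_idx else x for i, x in enumerate(series)]


def mean_impute(series):
    """Replace every missing value with the mean of the observed values."""
    observed = [x for x in series if x is not None]
    mean = sum(observed) / len(observed)
    return [mean if x is None else x for x in series]


def linear_interpolate(series):
    """Fill each gap on the straight line between its nearest observed
    neighbours; gaps at either end copy the nearest observed value."""
    known = [i for i, x in enumerate(series) if x is not None]
    filled = list(series)
    for i, x in enumerate(series):
        if x is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            w = (i - left) / (right - left)
            filled[i] = series[left] + w * (series[right] - series[left])
    return filled


def rmse(actual, predicted):
    """Root mean square error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    n = len(actual)
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n


# Synthetic stand-in series (an assumption, not the study's data).
readings = [120 + 10 * math.sin(2 * math.pi * t / 48) for t in range(96)]

for rate in (0.10, 0.15, 0.25, 0.35):
    gappy = make_mcar(readings, rate, seed=1)
    for name, impute in (("mean", mean_impute), ("linear", linear_interpolate)):
        filled = impute(gappy)
        print(f"{rate:.0%} {name:>6}: RMSE={rmse(readings, filled):.3f} "
              f"MAPE={mape(readings, filled):.3f}%")
```

To mirror the study more closely, one would fit an ARIMA model to each imputed series and compute RMSE and MAPE on its forecasts; that step is omitted here to keep the sketch free of third-party dependencies.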

Keywords

univariate time series

ARIMA

missing data

imputation 

Presenting Author(s)

Nicholas Niako, University of Texas Rio Grande Valley
Kristina Vatcheva, University of Texas Rio Grande Valley

First Author

Nicholas Niako, University of Texas Rio Grande Valley

CoAuthor(s)

Kristina Vatcheva, University of Texas Rio Grande Valley
Jesus Melgarejo, Studies Coordinating Centre, Research Unit Hypertension and Cardiovascular Epidemiology, KU Leuven
Gladys Maestre, Rio Grande Valley Alzheimer’s Disease Resource Center for Minority Aging Research (RGV AD-RCMAR),

Tracks

Implementation and Analysis
Conference on Statistical Practice (CSP) 2023