
A Comparison of Linear, Ensemble, and Neural Network Architectures for Estimating Healthcare Costs

Presented During: SPEED 6: Social Statistics, Mental Health Statistics, and Survey Methods, Part 1

Sabine Esmaili, Speaker

Tuesday, Aug 4: 8:35 AM - 8:40 AM
3328
Contributed Speed

Thomas M. Menino Convention & Exhibition Center

Accurate estimation of individual medical costs is a cornerstone of business analytics in the insurance industry; however, the relationship between demographic factors and actual expenditures is often non-linear. This study uses a publicly available medical cost dataset containing health indicators such as age, BMI, and smoking status to compare the predictive performance of three models: Multiple Linear Regression, Random Forest, and a multi-layer Neural Network. The goal is to determine whether deep learning models provide a statistically significant improvement in Mean Absolute Error (MAE) and R-squared over traditional frequentist approaches. While linear models offer high interpretability, they may fail to capture the cost interaction between high BMI and smoking status, which non-linear models can represent. The findings aim to provide a practical framework for selecting the most efficient model for predicting health costs from personal demographics, balancing the trade-off between model complexity and interpretability.
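The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. The function names and the cost figures below are hypothetical and chosen only for illustration; they are not taken from the study's dataset or code.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute gap between observed and predicted costs."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Share of variance in observed costs explained by the predictions."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical annual costs (illustrative values, not from the study).
actual = [2000.0, 4000.0, 6000.0, 8000.0]
predicted = [2500.0, 3500.0, 6500.0, 7500.0]

mae = mean_absolute_error(actual, predicted)
r2 = r_squared(actual, predicted)
print(f"MAE = {mae:.1f}, R-squared = {r2:.3f}")
```

A lower MAE and a higher R-squared on held-out data would favor one model over another; whether a difference is statistically significant requires a separate test, which this sketch does not cover.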

Keywords

Regression Analysis

Machine Learning

Healthcare Analytics

Neural Networks

Business Intelligence

Main Sponsor

Business and Economic Statistics Section