MTL-MICE: Multi-Task Learning in Multiple Imputation by Chained Equations

Yang Feng Co-Author
New York University
 
Yuyu(Ruby) Chen First Author
New York University
 
Yuyu(Ruby) Chen Presenting Author
New York University
 
Tuesday, Aug 5: 11:35 AM - 11:50 AM
1562 
Contributed Papers 
Music City Center 
High-dimensional datasets are common in healthcare and public health, where multi-center electronic health records (EHRs) and national surveys pose complex missing data challenges. Traditional imputation methods struggle in these settings, as they handle missing values independently for each task. To address this, we propose Multi-Task Learning via Multiple Imputation by Chained Equations (MTL-MICE), a novel approach that enhances imputation by leveraging shared information across tasks.
MTL-MICE integrates multi-task learning into the MICE framework, capturing correlations among tasks to improve accuracy and robustness. Instead of imputing each task separately, it exploits relationships among features that are shared across tasks. Additionally, we incorporate a transferable source detection technique to identify informative tasks, further refining the imputation.
Through simulations and real-world studies, we show that MTL-MICE significantly reduces imputation error and bias compared to single-task methods while preserving MICE's flexibility. These findings highlight the potential of multi-task learning to improve missing data methodologies for large-scale, high-dimensional studies.
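As a rough illustration of the idea described above (not the authors' implementation), the following Python sketch runs a chained-equations loop over several tasks that share the same columns. For each column it fits a pooled regression across all tasks, then shrinks each task's coefficients toward that pooled fit with a ridge-type penalty. The names `ridge_toward` and `mtl_chained_imputation`, the penalty `lam`, and the choice of ridge shrinkage rather than a Lasso-based estimator are assumptions made for illustration. The sketch returns one deterministic conditional-mean completion per task, whereas proper multiple imputation would draw several completed datasets, and it omits transferable source detection entirely.

```python
import numpy as np


def ridge_toward(D, y, center, lam):
    """Minimise ||y - D b||^2 + lam * ||b - center||^2 in closed form."""
    p = D.shape[1]
    return np.linalg.solve(D.T @ D + lam * np.eye(p), D.T @ y + lam * center)


def mtl_chained_imputation(tasks, n_iter=10, lam=5.0):
    """Hypothetical multi-task chained-equations imputation (illustrative only).

    tasks: list of 2-D float arrays with identical columns, np.nan = missing.
    Assumes every column has at least one observed value in every task.
    Returns a list of completed arrays; observed entries are left untouched.
    """
    tasks = [np.asarray(X, dtype=float) for X in tasks]
    masks = [np.isnan(X) for X in tasks]
    n_cols = tasks[0].shape[1]

    # Step 0: start from column-mean imputation within each task.
    filled = []
    for X, M in zip(tasks, masks):
        Xf = X.copy()
        Xf[M] = np.take(np.nanmean(X, axis=0), np.where(M)[1])
        filled.append(Xf)

    for _ in range(n_iter):
        for j in range(n_cols):
            # Regress column j on the other columns, using rows where j is observed.
            designs, targets = [], []
            for Xf, M in zip(filled, masks):
                obs = ~M[:, j]
                others = np.delete(Xf[obs], j, axis=1)
                designs.append(np.column_stack([np.ones(obs.sum()), others]))
                targets.append(Xf[obs, j])

            # Shared information: one pooled fit across all tasks
            # (tiny ridge term only for numerical stability).
            pooled = ridge_toward(np.vstack(designs), np.concatenate(targets),
                                  np.zeros(n_cols), 1e-8)

            for Xf, M, D, y in zip(filled, masks, designs, targets):
                # Task-specific fit, shrunk toward the pooled coefficients.
                beta = ridge_toward(D, y, pooled, lam)
                mis = M[:, j]
                if mis.any():
                    others = np.delete(Xf[mis], j, axis=1)
                    Dm = np.column_stack([np.ones(mis.sum()), others])
                    Xf[mis, j] = Dm @ beta  # conditional-mean update
    return filled
```

Setting `lam` close to zero recovers independent per-task imputation, while a very large `lam` forces every task to use the pooled coefficients; intermediate values borrow strength across tasks without assuming they are identical.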

Keywords

Multi-Task Learning

Missing Data

Multiple Imputation

Transfer Learning

High-Dimensional Inference

Lasso 

Main Sponsor

Section on Statistical Learning and Data Science