Model Selection from Incomplete Data in Supervised and Unsupervised Learning

Giuseppe Vinci, First Author and Presenting Author
University of Notre Dame

Wednesday, Aug 6: 11:20 AM - 11:35 AM
2266 
Contributed Papers 
Music City Center 
Scientific datasets are often undermined by missing data, which can occur either randomly or structurally. Applying traditional supervised and unsupervised learning techniques to these incomplete datasets poses significant challenges. Model selection in particular becomes difficult, because partially observed random vectors affect both resampling methods and the theoretical guarantees behind them. By leveraging resampling techniques, information theory, and stability measures, we propose novel approaches to model selection in supervised and unsupervised learning, with a particular focus on factor analysis and graphical modeling. We provide theoretical foundations and simulation results to demonstrate the effectiveness of these methods, along with applications to neuroscience and genomics.
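
To make the setting concrete, the following is a minimal, generic sketch of tuning-parameter selection by cross-validation when entries are missing. It is not the method proposed in this talk, which the abstract does not specify. It assumes entries are missing completely at random, estimates the covariance from pairwise-complete observations, and picks a soft-thresholding level by comparing training and held-out estimates. All function names (pairwise_cov, soft_threshold, cv_select_threshold) and the simulated data are hypothetical and chosen for illustration only.

```python
import numpy as np

def pairwise_cov(X):
    """Covariance from pairwise-complete observations; NaN marks a missing entry.

    Returns the estimate and the number of rows in which each pair is jointly observed.
    """
    obs = ~np.isnan(X)
    n_j = obs.sum(axis=0)
    mean = np.where(obs, X, 0.0).sum(axis=0) / np.maximum(n_j, 1)
    Z = np.where(obs, X - mean, 0.0)  # centered data, missing entries set to 0
    counts = obs.T.astype(float) @ obs.astype(float)
    with np.errstate(invalid="ignore", divide="ignore"):
        S = (Z.T @ Z) / counts  # NaN where a pair is never jointly observed
    return S, counts

def soft_threshold(S, lam):
    """Shrink off-diagonal entries toward zero by lam; leave the diagonal untouched."""
    T = np.sign(S) * np.maximum(np.abs(S) - lam, 0.0)
    np.fill_diagonal(T, np.diag(S))
    return T

def cv_select_threshold(X, lams, n_folds=5, seed=0):
    """Choose lam by K-fold cross-validation over rows.

    Loss: squared error between the thresholded training estimate and the
    held-out pairwise estimate, summed over pairs observed in both splits.
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(X.shape[0]), n_folds)
    losses = np.zeros(len(lams))
    for test_idx in folds:
        train_mask = np.ones(X.shape[0], dtype=bool)
        train_mask[test_idx] = False
        S_train, c_train = pairwise_cov(X[train_mask])
        S_test, c_test = pairwise_cov(X[test_idx])
        ok = (c_train > 0) & (c_test > 0)
        for i, lam in enumerate(lams):
            diff = soft_threshold(S_train, lam) - S_test
            losses[i] += np.sum(diff[ok] ** 2)
    return lams[int(np.argmin(losses))], losses

# Simulated example (assumption: sparse tridiagonal covariance, 20% of entries missing at random).
rng = np.random.default_rng(1)
n, p = 200, 10
Sigma = np.eye(p) + 0.4 * (np.eye(p, k=1) + np.eye(p, k=-1))
X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
X[rng.random((n, p)) < 0.2] = np.nan

lams = np.linspace(0.0, 0.5, 11)
best_lam, losses = cv_select_threshold(X, lams)
print("selected threshold:", best_lam)
```

The sketch also hints at why the problem is hard: each covariance entry is estimated from a different effective sample size, so held-out losses are not equally reliable across pairs, and under structural missingness some pairs may never be jointly observed in a fold at all.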

Keywords

Bayesian information criterion

cross-validation

missing data

sparsity

tuning parameter

variable selection 

Main Sponsor

IMS