Addressing Challenges in Variable Selection of Multinomial Models with Proposed L0L2 Regularization

Soumya Sahu Co-Author
 
Dulal Bhaumik Co-Author
University of Illinois at Chicago
 
Avisek Datta First Author
 
Avisek Datta Presenting Author
 
Wednesday, Aug 6: 8:35 AM - 8:50 AM
1540 
Contributed Papers 
Music City Center 
Multinomial regression is a powerful tool for modeling categorical outcomes with two or more classes. However, it faces several challenges, including information loss from categorization and increased complexity from fitting multiple linear models, which leads to parameter inflation. Existing variable selection techniques improve model sparsity but struggle with sparser data, often missing true signals and introducing false positives. L0-norm regularization induces exact sparsity but is computationally prohibitive because the resulting problem is non-convex and NP-hard. Existing software for L0 regularization is slow, and higher data complexity worsens the inefficiency. To address these challenges, we propose an L0L2 multinomial logistic regression algorithm that enables precise feature selection while maintaining computational feasibility. Our approach integrates a systematic swapping mechanism to enhance optimization and employs Iteratively Reweighted Least Squares (IRLS) to improve efficiency. This proposal is motivated by our real-world genetic dataset, consisting of several hundred SNP predictors associated with multi-category mental health outcomes in individuals exposed to traumatic events, mental health disorders, and substance use disorders.
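
To make the L0L2 penalty concrete, the following is a minimal sketch of the kind of objective such an algorithm might minimize: the multinomial negative log-likelihood plus an L0 term and an L2 (ridge) term. It is an illustration under stated assumptions, not the authors' implementation. The function names, the tuning parameters `lam0` and `lam2`, and the choice to count a predictor as selected when any of its class-specific coefficients is nonzero are all hypothetical; the swapping mechanism and the IRLS updates are not shown.

```python
import numpy as np


def softmax(scores):
    """Row-wise softmax; the row maximum is subtracted for numerical stability."""
    shifted = scores - scores.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def l0l2_objective(X, y, B, lam0, lam2):
    """Penalized negative log-likelihood for multinomial logistic regression.

    X    : (n, p) predictor matrix
    y    : (n,) integer class labels in 0..K-1
    B    : (p, K) coefficient matrix, one column per class
    lam0 : weight on the number of selected predictors (assumed row-wise L0)
    lam2 : weight on the squared L2 norm of all coefficients
    """
    probs = softmax(X @ B)
    # Negative log-likelihood of the observed classes.
    nll = -np.log(probs[np.arange(len(y)), y]).sum()
    # Assumption: a predictor is "selected" if any coefficient in its row is nonzero.
    n_selected = np.count_nonzero(np.any(B != 0, axis=1))
    ridge = np.sum(B ** 2)
    return nll + lam0 * n_selected + lam2 * ridge
```

The L0 term charges a fixed cost per selected predictor, which is what yields exact sparsity, while the L2 term shrinks the surviving coefficients. Because the L0 count is discontinuous, gradient-based methods cannot minimize this objective directly, which is why combinatorial moves such as swapping predictors in and out of the active set are typically needed.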

Keywords

high dimensional, multinomial

sparsity, feature selection

optimization

efficiency

IRLS

L0L2 regularization 

Main Sponsor

Mental Health Statistics Section