29: Improving naive Bayes classifiers with high-dimensional non-Gaussian data

Gyuhyeong Goh Co-Author
Department of Statistics, Kyungpook National University
 
Dipak Dey Co-Author
University of Connecticut
 
Mijin Jeong First Author
Kyungpook National University
 
Mijin Jeong Presenting Author
Kyungpook National University
 
Monday, Aug 4: 2:00 PM - 3:50 PM
1191 
Contributed Posters 
Music City Center 
The naive Bayes classifier, which assumes the conditional independence of predictors, improves classification efficiency and has a great advantage in handling high-dimensional data as well as imbalanced data. However, the success of the naive Bayes classifier hinges on the normality assumption for each continuous predictor, and its performance degrades considerably when many irrelevant predictors are included.
In this paper, we develop a way of improving the performance of naive Bayes classifiers on high-dimensional non-Gaussian data. To remove irrelevant predictors, we develop an efficient variable selection procedure for naive Bayes classification based on the Bayesian information criterion (BIC). In addition, we adapt the naive Bayes classifier to non-Gaussian data via power transformation. We conduct a comparative simulation study to demonstrate the superiority of our proposed classifier over existing classification methods. We also apply our proposed classifier to real data and confirm its effectiveness.
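
The two ideas above can be illustrated with a minimal sketch. This is not the authors' procedure, whose details the abstract does not give. It assumes a Box-Cox power transformation with the exponent chosen on a fixed grid, and a per-predictor BIC comparison between class-specific Gaussians and a single shared Gaussian. All function names, the grid, and the toy data are hypothetical.

```python
import math


def gaussian_loglik(values):
    """Maximized Gaussian log-likelihood of a sample (MLE mean and variance)."""
    n = len(values)
    mean = sum(values) / n
    var = max(sum((v - mean) ** 2 for v in values) / n, 1e-12)
    return -0.5 * n * (math.log(2 * math.pi * var) + 1.0)


def box_cox(values, lam):
    """Box-Cox power transformation; assumes strictly positive values."""
    if abs(lam) < 1e-12:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def class_loglik(values, labels, lam):
    """Class-conditional Gaussian log-likelihood after Box-Cox, with Jacobian term."""
    z = box_cox(values, lam)
    jacobian = (lam - 1.0) * sum(math.log(v) for v in values)
    per_class = sum(
        gaussian_loglik([t for t, y in zip(z, labels) if y == c])
        for c in set(labels)
    )
    return per_class + jacobian


def choose_lambda(values, labels, grid=None):
    """Pick the exponent on a grid (an assumption: -2.0 to 2.0 in steps of 0.25)."""
    grid = grid or [i / 4 for i in range(-8, 9)]
    return max(grid, key=lambda lam: class_loglik(values, labels, lam))


def bic_relevant(z, labels):
    """True if class-specific Gaussians have lower BIC than one shared Gaussian."""
    n = len(z)
    classes = set(labels)
    ll_null = gaussian_loglik(z)
    ll_alt = sum(
        gaussian_loglik([t for t, y in zip(z, labels) if y == c]) for c in classes
    )
    bic_null = -2.0 * ll_null + 2 * math.log(n)                # mean + variance
    bic_alt = -2.0 * ll_alt + 2 * len(classes) * math.log(n)   # per-class mean + variance
    return bic_alt < bic_null


def select_and_transform(columns, labels):
    """Transform each predictor, then keep only those that pass the BIC check."""
    kept = {}
    for j, col in enumerate(columns):
        lam = choose_lambda(col, labels)
        z = box_cox(col, lam)
        if bic_relevant(z, labels):
            kept[j] = (lam, z)
    return kept


# Toy data (hypothetical): two skewed, positive predictors and two classes.
log_vals = [(i - 50) / 25 for i in range(101)]
base = [math.exp(g) for g in log_vals]
labels = [0] * 101 + [1] * 101
relevant = base + [math.exp(g + 3.0) for g in log_vals]  # shifted in class 1
irrelevant = base + base                                 # identical in both classes

kept = select_and_transform([relevant, irrelevant], labels)
print(sorted(kept))  # indices of predictors retained for the naive Bayes fit
```

In this sketch the same exponent is used for both models being compared, so the Jacobian term cancels and the BIC comparison can be made on the transformed scale. The retained, transformed predictors would then be passed to an ordinary Gaussian naive Bayes fit.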

Keywords

Bayes classifier

Generative classifier

High-dimensional variable selection

Power transformation 

Abstracts


Main Sponsor

Section on Statistical Learning and Data Science