50 Classification of high-dimensional data using powerful tests and Bayes error rate
Tuesday, Aug 6: 10:30 AM - 12:20 PM
3892
Contributed Posters
Oregon Convention Center
One of the main aims of data modeling is to find the best classifier for new cases; for example, based on the gene expression profile of a new case, we can assign it to one of two groups. The high dimensionality of such datasets is the main obstacle to finding an accurate, parsimonious model, so genes that behave similarly in the two groups are removed to reduce the dimension. Candidate genes are selected by controlling the family-wise error rate (FWER) and are then used to build the classifier. Zhang and Deng [1] proposed an additional step: removing genes with redundant or highly correlated information before building the classifier. They identified more effective, non-redundant genes using the Bayes error rate (BER), which they estimated via the Bhattacharyya bound because the exact BER was not computable at the time, and they showed that this additional step improves classification accuracy. In this work, we further improve classification accuracy by computing the exact BER [2] and by using the uniformly most powerful unbiased test [3] to control the FWER.
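As a rough illustration of the Bhattacharyya bound mentioned above: for two classes with prior probabilities p1 and p2 and Gaussian class-conditional densities, the BER is bounded above by sqrt(p1*p2)*exp(-D_B), where D_B is the Bhattacharyya distance. The sketch below is a hypothetical univariate (per-gene) version, not the authors' implementation; the function name and parameters are illustrative assumptions.

```python
import numpy as np

def bhattacharyya_bound(mu1, var1, mu2, var2, p1=0.5):
    """Upper bound on the Bayes error rate for two univariate Gaussian
    class-conditional densities N(mu1, var1) and N(mu2, var2).
    Illustrative sketch only; names and defaults are assumptions."""
    var_avg = 0.5 * (var1 + var2)
    # Bhattacharyya distance between the two Gaussians
    d_b = (mu1 - mu2) ** 2 / (8.0 * var_avg) \
        + 0.5 * np.log(var_avg / np.sqrt(var1 * var2))
    p2 = 1.0 - p1
    # Bhattacharyya bound on the Bayes error rate
    return np.sqrt(p1 * p2) * np.exp(-d_b)
```

When the two class densities coincide, the bound equals 0.5 (no gene is informative); as the class means separate, the bound shrinks toward zero, which is why a small bound flags a gene as discriminative.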
Bayes error rate
Microarray data
Gene selection
Classification
Permutation test
Uniformly most powerful unbiased test