Sparse Convex Biclustering
Chenliang Gu
Co-Author
Center for Statistics and Data Science, Beijing Normal University
Jiakun Jiang
First Author
Beijing Normal University at Zhuhai
Thursday, Aug 7: 12:05 PM - 12:20 PM
2259
Contributed Papers
Music City Center
Biclustering is an unsupervised machine-learning technique that simultaneously clusters the rows and columns of a data matrix. It has attracted growing attention over the past two decades, driven by the increasing complexity and volume of data in fields such as genomics, transcriptomics, and other high-throughput omics technologies. However, discovering significant biclusters in large-scale datasets is an NP-hard problem. The accuracy and stability of most existing biclustering algorithms decrease significantly as dataset size increases, mainly because noise accumulates across high-dimensional features and because their optimization formulations are non-convex. To address this, we propose a new method called sparse convex biclustering (SCB), which penalizes noise to zero during biclustering. A tuning criterion based on clustering stability is developed to balance cluster fitting and sparsity. We conduct comprehensive numerical studies on simulated data to demonstrate the superior performance of SCB compared with several state-of-the-art alternatives. Furthermore, we apply our method to the analysis of mouse olfactory bulb (MOB) data.
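The idea of penalizing noise to zero can be illustrated with soft-thresholding, the proximal operator of an L1 penalty and a common building block in ADMM-type solvers. The sketch below is not the SCB algorithm itself: the abstract does not specify the penalty or the updates, so the function name `soft_threshold`, the toy matrix `A`, and the threshold `lam` are all assumptions made for illustration.

```python
import numpy as np

def soft_threshold(a, lam):
    """Proximal operator of lam * ||.||_1: shrink entries toward zero
    and set every entry with |a| <= lam to zero."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)

# Hypothetical 4x4 data matrix (not from the abstract): a 2x2 signal
# block in the top-left corner, small noise-like values elsewhere.
A = np.array([
    [3.1, 2.9, 0.2, -0.1],
    [3.0, 3.2, -0.3, 0.1],
    [0.1, -0.2, 0.3, 0.0],
    [-0.1, 0.2, 0.1, -0.2],
])

# Assumed threshold; in practice it would be chosen by a tuning criterion.
S = soft_threshold(A, lam=0.5)

# Entries outside the signal block are zeroed; block entries shrink by lam.
print(np.count_nonzero(S))
```

With this threshold only the four entries of the signal block remain nonzero, which loosely mirrors how a sparsity penalty can suppress noise features while the block structure is retained.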
Convex biclustering
Sparsity
ADMM
High-dimensional data
Main Sponsor
Section on Statistical Learning and Data Science