Addressing Heterogeneity in High-Dimensional Regression through Bayesian Structured Sparse Clustering

Maoran Xu, Speaker
 
Tuesday, Aug 5: 11:00 AM - 11:25 AM
Invited Paper Session 
Music City Center 
In many high-dimensional regression settings, it is appealing to impose low-dimensional structures on the coefficients. Additionally, clustering the coefficients helps uncover latent groups that reflect heterogeneity in the relationship between covariates and outcomes.
Clustering such high-dimensional data under low-dimensional constraints is computationally challenging, particularly for optimization-based methods, because the mixture problem is nonconvex. Bayesian methods offer a natural framework for sampling from the mixture model and quantifying uncertainty, but specifying the prior remains difficult: spike-and-slab priors make sampling computationally costly, whereas continuous shrinkage priors are ineffective at inducing exact sparsity within mixture models. To address these challenges, we propose an optimization-driven structured sparse prior within a nonparametric Bayesian clustering approach. The hierarchical prior structure enables an efficient and straightforward Gibbs sampler. On the theoretical side, we establish consistency results in terms of both optimal parameter recovery rates and clustering accuracy. We illustrate the effectiveness of the proposed method on a compositional regression task, analyzing GDP contributions from multiple industries across 51 states.
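
The setting described above can be made concrete with a small simulation. The sketch below is not the proposed method: it only generates data in which each latent cluster has its own sparse coefficient vector, then compares a single pooled least-squares fit with per-cluster fits that use the true (oracle) labels. All names, dimensions, and noise levels are illustrative assumptions, and plain least squares stands in for any sparse or Bayesian estimator.

```python
import numpy as np


def simulate_clustered_sparse_regression(n=300, p=20, k=2, s=3, noise=0.1, seed=0):
    """Simulate y_i = x_i' beta_{z_i} + eps_i with cluster-specific sparse beta.

    Hypothetical setup for illustration only: k latent clusters, each with
    exactly s nonzero coefficients out of p.
    """
    rng = np.random.default_rng(seed)
    betas = np.zeros((k, p))
    for c in range(k):
        # Each cluster gets its own randomly chosen support of size s.
        support = rng.choice(p, size=s, replace=False)
        betas[c, support] = rng.normal(0.0, 2.0, size=s)
    z = rng.integers(0, k, size=n)      # latent cluster labels
    X = rng.normal(size=(n, p))         # covariates
    # Row-wise inner product of x_i with the coefficients of its own cluster.
    y = np.einsum("ij,ij->i", X, betas[z]) + noise * rng.normal(size=n)
    return X, y, z, betas


def pooled_vs_clusterwise_mse(X, y, z, k):
    """In-sample MSE of one pooled least-squares fit vs. oracle per-cluster fits."""
    beta_pooled, *_ = np.linalg.lstsq(X, y, rcond=None)
    pooled_mse = np.mean((y - X @ beta_pooled) ** 2)

    resid = np.empty_like(y)
    for c in range(k):
        idx = z == c
        beta_c, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        resid[idx] = y[idx] - X[idx] @ beta_c
    return pooled_mse, np.mean(resid ** 2)


if __name__ == "__main__":
    X, y, z, betas = simulate_clustered_sparse_regression()
    pooled, clusterwise = pooled_vs_clusterwise_mse(X, y, z, k=2)
    print(f"pooled MSE: {pooled:.3f}, oracle clusterwise MSE: {clusterwise:.3f}")
```

The pooled fit forces one coefficient vector on all observations, so it cannot capture the heterogeneity. In practice the labels are unknown and must be inferred jointly with the sparse coefficients, which is the problem the abstract addresses.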

Keywords

Bayesian Nonparametrics

Dimension Reduction

Compositional Regression