Multivariate species sampling models
Tuesday, Aug 5: 11:25 AM - 11:50 AM
Invited Paper Session
Music City Center
Species sampling processes have long provided a fundamental framework for random discrete distributions and exchangeable sequences. However, analyzing data from distinct yet related sources requires a broader notion of probabilistic invariance, with partial exchangeability as the natural choice. Over the past two decades, numerous models for partially exchangeable data, known as dependent nonparametric priors, have emerged, including hierarchical, nested, and additive processes. Despite their widespread use in statistics and machine learning, a unifying framework remains elusive, leaving key questions about their learning mechanisms unanswered.
We fill this gap by introducing multivariate species sampling models, a general class of nonparametric priors encompassing most existing dependent nonparametric processes. These models are defined by a partially exchangeable partition probability function, encoding the induced multivariate clustering structure. We establish their core distributional properties and dependence structure, showing that borrowing of information across groups is entirely determined by shared ties. This provides new insights into their learning mechanisms, including a principled explanation for the correlation structure observed in existing models.
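As a rough illustration of what "shared ties" means, the sketch below simulates one familiar special case, a hierarchical Dirichlet process, through its Chinese restaurant franchise representation, and lists the species labels that appear in more than one group. This is not the authors' construction or code: the choice of the hierarchical Dirichlet process as the example, the function names `sample_crf` and `shared_species`, and the parameter names `alpha` and `gamma` are all assumptions made for illustration.

```python
import random


def sample_crf(n_per_group, alpha, gamma, seed=0):
    """Simulate species labels from a Chinese restaurant franchise.

    Illustrative sketch only (hypothetical names, not from the abstract).
    n_per_group: number of observations in each group.
    alpha: group-level concentration (must be > 0).
    gamma: top-level concentration (must be > 0).
    Returns one list of integer species labels per group.
    """
    rng = random.Random(seed)
    dish_tables = []  # number of tables, across all groups, serving each species
    labels = []
    for n in n_per_group:
        table_counts = []  # observations seated at each table of this group
        table_dish = []    # species served at each table of this group
        group_labels = []
        for _ in range(n):
            # Join an existing table with weight equal to its size,
            # or open a new table with weight alpha.
            weights = table_counts + [alpha]
            t = rng.choices(range(len(weights)), weights=weights)[0]
            if t == len(table_counts):
                # A new table draws its species from the shared top level:
                # an existing species with weight equal to its table count,
                # or a brand-new species with weight gamma.
                dish_weights = dish_tables + [gamma]
                d = rng.choices(range(len(dish_weights)), weights=dish_weights)[0]
                if d == len(dish_tables):
                    dish_tables.append(0)
                dish_tables[d] += 1
                table_counts.append(1)
                table_dish.append(d)
            else:
                table_counts[t] += 1
            group_labels.append(table_dish[t])
        labels.append(group_labels)
    return labels


def shared_species(labels):
    """Return the set of species labels observed in every group."""
    return set.intersection(*(set(group) for group in labels))


if __name__ == "__main__":
    labels = sample_crf([50, 50], alpha=1.0, gamma=1.0, seed=1)
    print("species shared by both groups:", sorted(shared_species(labels)))
```

In this special case, two groups can only have a value in common when a table in each group draws the same species from the top level. Those cross-group ties are the quantity that, according to the abstract, fully determines the borrowing of information across groups.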
Beyond offering a cohesive theoretical foundation, our approach serves as a constructive tool for developing new models and opens new research directions aimed at capturing even richer dependence structures.
Bayesian Nonparametrics
Dependent nonparametric prior
Dirichlet process
Partial exchangeability
Pitman-Yor process
Random partition