Profiling Arthropods of the World with Factorization-Derived Indicators
Sunday, Aug 3: 5:35 PM - 5:50 PM
2282
Contributed Papers
Music City Center
Modern, semi-autonomous biomonitoring programs are producing massive datasets on global biodiversity. The data frequently include rich information on tens of thousands of species, many of which are largely unstudied. Collection, individual or bulk identification, and subsequent modeling of such massive data are extremely resource-intensive tasks. There is therefore growing interest in simplifying the analysis pipeline by using indicator species: a subset of species that reflects overall ecosystem health, the presence of specific habitats, or the distributions of unmeasured species. We propose a model-based approach to learning site and species clusters from abundance data and selecting indicator species on a per-cluster basis. To address the added challenge of modeling hyper-sparse, high-dimensional counts with large values, we propose a hierarchical nonnegative matrix factorization that combines recent developments to infer the factorization rank and to flexibly attribute abundances to different factors. Indicators are selected based on their ability to predict other species belonging to the same cluster. We showcase this workflow on a large assemblage of arthropods collected as part of the Global Malaise Trap program.
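The general shape of such a workflow (factorize a site-by-species count matrix, assign species to clusters, pick one indicator per cluster) can be illustrated with a much simpler stand-in. The sketch below is not the hierarchical model described in the abstract: it assumes a fixed, user-supplied rank, uses plain KL-divergence NMF with multiplicative updates, and scores candidate indicators with a simple correlation heuristic rather than a decision-theoretic criterion. The function names `kl_nmf` and `pick_indicators` are hypothetical.

```python
import numpy as np


def kl_nmf(Y, rank, n_iter=300, seed=0, eps=1e-10):
    """Plain KL-divergence NMF (multiplicative updates), Y ~ W @ H.

    Assumption: the rank is fixed in advance; the abstract's model infers it.
    """
    rng = np.random.default_rng(seed)
    n_sites, n_species = Y.shape
    W = rng.gamma(1.0, 1.0, size=(n_sites, rank)) + eps
    H = rng.gamma(1.0, 1.0, size=(rank, n_species)) + eps
    for _ in range(n_iter):
        ratio = Y / (W @ H + eps)
        W *= (ratio @ H.T) / (H.sum(axis=1) + eps)
        ratio = Y / (W @ H + eps)
        H *= (W.T @ ratio) / (W.sum(axis=0)[:, None] + eps)
    # Rescale so each column of W sums to one; H[k, j] is then the total
    # count of species j that the fit attributes to factor k.
    scale = W.sum(axis=0)
    W = W / scale
    H = H * scale[:, None]
    return W, H


def pick_indicators(Y, H):
    """Assign each species to its dominant factor, then pick one indicator
    per cluster.

    Assumption: the score is the correlation between a species' log
    abundance and the log total abundance of the other cluster members,
    a heuristic stand-in for a predictive selection criterion.
    """
    labels = H.argmax(axis=0)
    log_y = np.log1p(Y)
    indicators = {}
    for k in np.unique(labels):
        members = np.flatnonzero(labels == k)
        if members.size == 1:
            indicators[int(k)] = int(members[0])
            continue
        total = Y[:, members].sum(axis=1)
        scores = []
        for j in members:
            rest = np.log1p(total - Y[:, j])
            with np.errstate(invalid="ignore", divide="ignore"):
                r = np.corrcoef(log_y[:, j], rest)[0, 1]
            # A constant column gives an undefined correlation; rank it last.
            scores.append(np.nan_to_num(r, nan=-1.0))
        indicators[int(k)] = int(members[int(np.argmax(scores))])
    return labels, indicators
```

Calling `kl_nmf(Y, rank=K)` on a sites-by-species count array and passing the resulting `H` to `pick_indicators(Y, H)` returns a cluster label per species and one indicator column index per cluster. Overdispersion, rank inference, and extreme sparsity, which the abstract names as the main modeling challenges, are deliberately left out of this sketch.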
Abundance data
Matrix factorization
Decision theory
Overdispersion
Ecology
Main Sponsor
Section on Statistics and the Environment