Principal Subsimplex Analysis
James Marron
Co-Author
University of North Carolina at Chapel Hill
Andrew Wood
Co-Author
The Australian National University
Sunday, Aug 3: 5:20 PM - 5:35 PM
2204
Contributed Papers
Music City Center
Compositional data, also referred to as simplicial data, arise naturally in many scientific domains such as geochemistry, microbiology, and economics. In such domains, obtaining sensible lower-dimensional representations and modes of variation plays an important role. A typical approach to the problem is to apply a log-ratio transformation followed by principal component analysis (PCA). However, this approach has several notable weaknesses: it amplifies variation in minor variables and obscures variation in major ones, is not directly applicable to data sets containing zeros, and has limited ability to capture linear patterns. We propose novel methods that produce nested sequences of simplices of decreasing dimensions using the backwards principal component analysis framework. These nested sequences offer both interpretable lower-dimensional representations and linear modes of variation. In addition, our methods are applicable to data sets containing zeros without any modification. Our methods are demonstrated on simulated data and on relative abundances of diatom species during the late Pliocene.
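The typical approach mentioned in the abstract can be illustrated with a minimal sketch. It assumes the centered log-ratio (CLR) transform as the log-ratio step, which is one common choice and is not named in the abstract, and it uses hypothetical toy data drawn from a Dirichlet distribution. The helper names `clr` and `pca_scores` are illustrative only. This is not the proposed subsimplex method; it only shows the baseline and why zeros are a problem for it.

```python
import numpy as np

def clr(x):
    """Centered log-ratio transform of compositions stored as rows."""
    log_x = np.log(x)
    return log_x - log_x.mean(axis=1, keepdims=True)

def pca_scores(z, n_components=2):
    """PCA via SVD of column-centered data.

    Returns the component scores and the explained-variance ratios.
    """
    centered = z - z.mean(axis=0, keepdims=True)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    var_ratio = s**2 / np.sum(s**2)
    return u[:, :n_components] * s[:n_components], var_ratio[:n_components]

rng = np.random.default_rng(0)
# Hypothetical toy data: 50 compositions with 4 parts (rows sum to 1).
comps = rng.dirichlet(alpha=[8.0, 4.0, 2.0, 0.5], size=50)

# Baseline: log-ratio transform followed by PCA.
clr_comps = clr(comps)
scores, ratio = pca_scores(clr_comps)

# A composition with zero parts breaks the log-ratio step:
# log(0) is -inf, so the transformed values are no longer finite.
with np.errstate(divide="ignore", invalid="ignore"):
    clr_with_zero = clr(np.array([[0.5, 0.5, 0.0, 0.0]]))
has_nonfinite = not np.isfinite(clr_with_zero).all()
```

In this sketch the smallest part (Dirichlet parameter 0.5) tends to dominate the log-ratio variance even though it carries little mass, which is the amplification of minor variables that the abstract describes.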
Modes of variation
Backwards approach
Nested relations
Compositional data
Paleoceanography
Main Sponsor
Section on Statistical Learning and Data Science