Structure Learning and Dimension Reduction

Chair: Hon Yiu So
Oakland University
 
Sunday, Aug 3: 4:00 PM - 5:50 PM
4029 
Contributed Papers 
Music City Center 
Room: CC-102B 

Main Sponsor

Section on Statistical Learning and Data Science

Presentations

Association Structure Learning in Multivariate Categorical Response Regression

Modeling the complex relationships between multiple categorical response variables as a function of predictors is a fundamental task in the analysis of categorical data. However, existing methods can be difficult to interpret and may lack flexibility. To address these challenges, we introduce a penalized likelihood method for multivariate categorical response regression that relies on a novel subspace decomposition to parameterize interpretable association structures. Our approach models the relationships between categorical responses by identifying mutual, joint, and conditionally independent associations, which yields a linear problem within a tensor product space. We establish theoretical guarantees for our estimator, including error bounds in high-dimensional settings, and demonstrate the method's interpretability and prediction accuracy through comprehensive simulation studies. 
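As a point of reference for the problem setup, the sketch below treats two binary responses as a single multinomial over their four joint categories and fits a plain, unpenalized multinomial logistic regression by gradient descent. The data, layer of jointness, and fit are all illustrative; this is the saturated baseline, not the paper's penalized subspace-decomposition estimator.

```python
import numpy as np

rng = np.random.default_rng(4)

# Two binary responses (Y1, Y2) encoded as one multinomial over the
# 4 joint categories {(0,0), (0,1), (1,0), (1,1)}.
n, p = 500, 3
X = rng.normal(size=(n, p))
logits_true = X @ rng.normal(size=(p, 4))
probs = np.exp(logits_true)
probs /= probs.sum(axis=1, keepdims=True)
Y = np.array([rng.choice(4, p=pr) for pr in probs])  # joint category labels

def softmax(Z):
    Z = Z - Z.max(axis=1, keepdims=True)
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

# Plain multinomial logistic regression via gradient descent (no penalty,
# no association-structure constraints).
B = np.zeros((p, 4))
onehot = np.eye(4)[Y]
for _ in range(300):
    P = softmax(X @ B)
    B -= 0.1 * X.T @ (P - onehot) / n

acc = np.mean(softmax(X @ B).argmax(axis=1) == Y)
print(f"in-sample accuracy: {acc:.2f}")
```

The paper's contribution can be read as interpolating between this saturated joint model and the fully independent one, with the penalty selecting mutual, joint, or conditionally independent association structures.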

Keywords

multinomial logistic regression

categorical data analysis

log-linear models 

Co-Author(s)

Adam Rothman, University of Minnesota
Aaron Molstad, University of Minnesota

First Author

Hongru Zhao

Presenting Author

Hongru Zhao

Belted and Ensembled Neural Network for Linear and Nonlinear Sufficient Dimension Reduction

We introduce a unified, flexible, and easy-to-implement framework for sufficient dimension reduction (SDR) that accommodates both linear and nonlinear dimension reduction, and both the conditional distribution and the conditional mean as targets of estimation. This unified framework is achieved by a specially structured neural network, the Belted and Ensembled Neural Network (BENN), which consists of a narrow latent layer, which we call the belt, and a family of transformations of the response, which we call the ensemble. By strategically placing the belt at different layers of the neural network, we can achieve linear or nonlinear sufficient dimension reduction, and by choosing an appropriate transformation family, we can achieve dimension reduction for the conditional distribution or the conditional mean. Moreover, thanks to the computational efficiency of neural networks, the method is very fast, overcoming a computational bottleneck of traditional sufficient dimension reduction estimators, which require inverting a matrix of dimension p or n. We develop the algorithm, derive its convergence rate, and compare the method with existing SDR methods. 
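As a rough illustration of the architecture described above, the untrained numpy forward pass below places a narrow "belt" as the first layer, so the reduction X -> belt is linear in X, and regresses an ensemble of response transformations on it. The layer sizes and the cosine ensemble are invented for the example, not the authors' specification.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: p predictors reduced to a d-dimensional belt, predicting an
# ensemble of m transformations of the response, here f_k(y) = cos(k*y).
n, p, d, m = 200, 10, 2, 5
X = rng.normal(size=(n, p))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=n)
F = np.cos(np.outer(y, np.arange(1, m + 1)))  # ensemble targets, n x m

def relu(z):
    return np.maximum(z, 0.0)

# Linear SDR variant: the belt is the FIRST layer, so X @ W_belt is a
# linear reduction of X; nonlinearity appears only after the belt.
W_belt = rng.normal(size=(p, d)) * 0.3
W_hid = rng.normal(size=(d, 16)) * 0.3
W_out = rng.normal(size=(16, m)) * 0.3

belt = X @ W_belt                     # n x d candidate sufficient reduction
pred = relu(belt @ W_hid) @ W_out     # n x m predictions of the ensemble
loss = float(np.mean((pred - F) ** 2))
print(belt.shape, pred.shape)
```

Training would minimize `loss` over all three weight matrices; moving the belt deeper into the network (after nonlinear layers) would give a nonlinear reduction instead.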

Keywords

Autoencoder

Convergence rate

Covering numbers

Deep learning

Probability characterizing family 

Co-Author

Bing Li, Penn State University

First Author

Yin Tang

Presenting Author

Yin Tang

Covariate-assisted Grade of Membership Model

The Grade of Membership (GoM) model is a popular individual-level mixture model for multivariate categorical data such as survey responses. In modern data collection, numerous covariates are often gathered alongside the target response data, many of which share a similar latent structure. To leverage this covariate information for improved estimation of the latent structure of the target data, we introduce Covariate-assisted Grade of Membership (CoGoM) models and develop an efficient estimation algorithm based on spectral methods. For model identifiability, we establish a sufficient condition weaker than that required in the covariate-free case. For theoretical guarantees, we establish consistency in high-dimensional settings, demonstrating how incorporating covariates aids the estimation of the latent structure. In simulation studies, our proposed method outperforms traditional approaches in both computational efficiency and estimation accuracy. Finally, we demonstrate our method by applying it to a Trends in International Mathematics and Science Study (TIMSS) dataset. 

Keywords

Grade of Membership Model

Identifiability

Sequential Projection Algorithm

Covariate Assistance

Spectral Method 

Co-Author

Yuqi Gu, Columbia University

First Author

Zhiyu Xu, Columbia University

Presenting Author

Zhiyu Xu, Columbia University

Federated multimodal learning with heterogeneous modality and distribution shift

Federated learning enables the analysis of multi-site real-world data (RWD) while preserving data privacy, yet challenges persist due to heterogeneous modality availability and distribution shifts across sites. In this work, we develop a novel federated multimodal learning framework to improve causal inference in distributed research networks (DRNs), integrating electronic health records (EHRs) and genetic biomarkers. Traditional methods often fail to account for structural missingness and site-specific heterogeneity, leading to biased estimates of treatment effects.
To address this, we propose a new statistical framework that accounts for distribution shifts of populations across sites, while pursuing efficiency and bias correction by leveraging information from all available modalities across sites. In addition, we employ multiple negative control outcomes to calibrate estimates and mitigate residual systematic biases, including unmeasured confounding. 

Keywords

Causal inference

Negative control outcomes

Average treatment effect

Bias Correction

Multi-Modality 

Co-Author(s)

Huiyuan Wang, University of Pennsylvania
Jingyue Huang
Yong Chen, University of Pennsylvania, Perelman School of Medicine

First Author

Dazheng Zhang

Presenting Author

Dazheng Zhang

Meta-Fusion: A Unified Framework For Multi-modality Fusion with Mutual Learning

Multi-modal data fusion has become increasingly critical for enhancing the predictive power of machine learning methods across diverse fields, from autonomous driving to medical diagnosis. Traditional fusion methods—early fusion, intermediate fusion, and late fusion—approach data integration differently, each with distinct advantages and limitations. In this paper, we introduce Meta-Fusion, a flexible and principled framework that unifies these existing approaches as special cases. Drawing inspiration from mutual deep learning and ensemble learning, Meta-Fusion constructs a cohort of models based on various combinations of latent representations across modalities, and further enhances predictive performance through soft information sharing within the cohort. Our approach is model-agnostic in learning the latent representations, allowing it to flexibly adapt to the unique characteristics of each modality. Theoretically, our soft information sharing mechanism effectively reduces the generalization error. Empirically, Meta-Fusion consistently outperforms conventional fusion strategies in extensive synthetic experiments. We further validate our approach on real-world applications, including Alzheimer's disease detection and brain activity analysis. 
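The early- and late-fusion baselines that Meta-Fusion generalizes can be sketched in a few lines. The toy data, the per-modality ordinary-least-squares models, and the simple prediction averaging are all illustrative stand-ins, not the paper's framework.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two toy modalities driven by a shared latent signal z.
n = 300
z = rng.normal(size=n)
X1 = np.column_stack([z + rng.normal(size=n, scale=0.5) for _ in range(3)])
X2 = np.column_stack([z + rng.normal(size=n, scale=0.5) for _ in range(4)])
y = 2.0 * z + rng.normal(size=n, scale=0.3)

def ols_predict(X, y):
    # Least-squares fit with intercept; returns in-sample predictions.
    Xb = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return Xb @ beta

# Early fusion: concatenate raw features, fit one model.
early = ols_predict(np.hstack([X1, X2]), y)

# Late fusion: fit one model per modality, average the predictions.
late = (ols_predict(X1, y) + ols_predict(X2, y)) / 2

def mse(pred):
    return float(np.mean((pred - y) ** 2))

print(round(mse(early), 3), round(mse(late), 3))
```

Meta-Fusion's cohort sits between these extremes: it builds models on combinations of per-modality latent representations and shares soft information among them, rather than committing to one fusion point.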

Keywords

multi-modality fusion

deep mutual learning

ensemble learning

soft information sharing 

Co-Author(s)

Annie Qu, University of California, Irvine
Babak Shahbaba, UCI

First Author

Ziyi Liang

Presenting Author

Ziyi Liang

Principal Subsimplex Analysis

Compositional data, also referred to as simplicial data, naturally arise in many scientific domains such as geochemistry, microbiology, and economics. In such domains, obtaining sensible lower-dimensional representations and modes of variation plays an important role. A typical approach to the problem is to apply a log-ratio transformation followed by principal component analysis (PCA). However, this approach has several notable weaknesses: it amplifies variation in minor variables and obscures variation in major ones, is not directly applicable to data sets containing zeros, and has limited ability to capture linear patterns. We propose novel methods that produce nested sequences of simplices of decreasing dimension using the backwards principal component analysis framework. These nested sequences offer both interpretable lower-dimensional representations and linear modes of variation. In addition, our methods are applicable to data sets containing zeros without any modification. Our methods are demonstrated on simulated data and on relative abundances of diatom species during the late Pliocene. 
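The log-ratio-plus-PCA baseline criticized above can be sketched as follows, using the centered log-ratio (clr) transform, one common log-ratio choice, which indeed requires strictly positive compositions. The toy data are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy compositions: rows are strictly positive and sum to 1,
# so the clr transform is well defined (it fails on zeros).
raw = rng.gamma(shape=2.0, size=(50, 4))
comp = raw / raw.sum(axis=1, keepdims=True)

# Centered log-ratio: log(x) minus the row-wise mean of the logs,
# i.e. log of each part over the geometric mean of its row.
logc = np.log(comp)
clr = logc - logc.mean(axis=1, keepdims=True)

# PCA on the clr coordinates via SVD of the column-centered matrix.
centered = clr - clr.mean(axis=0)
U, s, Vt = np.linalg.svd(centered, full_matrices=False)
scores = centered @ Vt[:2].T  # 2-D representation of each composition

print(scores.shape)
```

The proposed backwards approach instead stays on the simplex, producing nested subsimplices of decreasing dimension rather than projections in clr space.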

Keywords

Modes of variation

Backwards approach

Nested relations

Compositional data

Paleoceanography 

Co-Author(s)

James Marron, University of North Carolina at Chapel Hill
Janice Scealy, Australian National University
Andrew Wood, Australian National University
Eric Grunsky, University of Waterloo
Kassel Hingee, Australian National University

First Author

Hyeon Lee

Presenting Author

Hyeon Lee

Revisit CP Tensor Decomposition: Statistical Optimality and Fast Convergence

We introduce a statistical and computational framework for tensor Canonical Polyadic (CP) decomposition, with a focus on statistical theory, convergence, and algorithmic improvements. First, we show that the Alternating Least Squares (ALS) algorithm achieves the desired error rate within three iterations when the CP rank $R = 1$. Second, for the more general case where $R > 1$, we derive statistical bounds for ALS, showing that the estimation error exhibits an initial phase of quadratic convergence followed by linear convergence until reaching the desired accuracy. Third, we propose a novel warm-start procedure for ALS in the $R > 1$ setting, which integrates tensor Tucker decomposition with simultaneous diagonalization (Jennrich's algorithm) to significantly enhance performance over existing benchmark methods. Numerical experiments support our theoretical findings, demonstrating the practical advantages of our approach. 
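A minimal rank-one ALS loop can be sketched as follows. The tensor dimensions, noise level, random initialization, and number of sweeps are illustrative; this is the textbook $R = 1$ update, not the authors' warm-start procedure for $R > 1$.

```python
import numpy as np

rng = np.random.default_rng(3)

# Rank-1 CP signal: T = outer(a, b, c) plus small noise.
a, b, c = rng.normal(size=5), rng.normal(size=6), rng.normal(size=7)
T = np.einsum('i,j,k->ijk', a, b, c) + 0.01 * rng.normal(size=(5, 6, 7))

# ALS from a random start: each update is the exact least-squares
# solution for one factor with the other two held fixed.
u, v, w = rng.normal(size=5), rng.normal(size=6), rng.normal(size=7)
for _ in range(5):  # a few sweeps; the abstract shows three suffice at R = 1
    u = np.einsum('ijk,j,k->i', T, v, w) / ((v @ v) * (w @ w))
    v = np.einsum('ijk,i,k->j', T, u, w) / ((u @ u) * (w @ w))
    w = np.einsum('ijk,i,j->k', T, u, v) / ((u @ u) * (v @ v))

That = np.einsum('i,j,k->ijk', u, v, w)
rel_err = float(np.linalg.norm(That - T) / np.linalg.norm(T))
print(f"relative reconstruction error: {rel_err:.3f}")
```

For $R > 1$ each factor update becomes a matrix least-squares problem against a Khatri-Rao product, which is where a good warm start (e.g. via Tucker compression plus Jennrich's algorithm, as proposed above) matters.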

Keywords

tensor

CP decomposition

alternating least squares

statistical bound 

Co-Author(s)

Julien Chhor
Olga Klopp
Anru Zhang, Duke University

First Author

Runshi Tang

Presenting Author

Runshi Tang