Structure Identification and Dimension Reduction Methods

Chair: Subhadeep Paul, The Ohio State University
Thursday, Aug 7: 10:30 AM - 12:20 PM
Session 4231: Contributed Papers
Music City Center, Room CC-207C

Main Sponsor

Section on Statistical Learning and Data Science

Presentations

Chain-linked multiple matrix integration via embedding alignment

Motivated by the increasing demand for multi-source data integration in various scientific fields, we study matrix completion in scenarios where the data exhibit a block-wise missing structure; specifically, only a few noisy submatrices representing (overlapping) parts of the full matrix are available. We propose the Chain-linked Multiple Matrix Integration (CMMI) procedure to efficiently combine the information that can be extracted from these individual noisy submatrices. CMMI begins by deriving entity low-rank embeddings for each observed submatrix, then aligns these embeddings using overlapping entities between pairs of submatrices, and finally aggregates them to reconstruct the entire matrix of interest. We establish, under mild regularity conditions, entrywise error bounds and normal approximations for the CMMI estimates. Simulation studies and real data applications show that CMMI is computationally efficient and effective in recovering the full matrix, even when overlaps between the observed submatrices are minimal.
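
As a concrete illustration of the embedding-alignment idea, the following is a minimal Python sketch under our own simplifying assumptions (a symmetric positive semidefinite low-rank matrix, two overlapping principal submatrices, Gaussian noise); it is not the authors' CMMI implementation, and all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 60, 3
Z = rng.normal(size=(n, r))                  # latent entity embeddings
M = Z @ Z.T                                  # full low-rank matrix (PSD for simplicity)

S1, S2 = np.arange(0, 40), np.arange(30, 60) # overlapping entity sets (overlap: 30-39)

def noisy(idx):
    # noisy symmetric principal submatrix indexed by the entity set idx
    E = 0.05 * rng.normal(size=(len(idx), len(idx)))
    return M[np.ix_(idx, idx)] + (E + E.T) / 2

def embed(A, r):
    # rank-r spectral embedding of an observed submatrix
    vals, vecs = np.linalg.eigh(A)
    top = np.argsort(vals)[::-1][:r]
    return vecs[:, top] * np.sqrt(np.abs(vals[top]))

X1, X2 = embed(noisy(S1), r), embed(noisy(S2), r)

# align X2 to X1's coordinate frame using the overlapping entities
ov1 = np.where(np.isin(S1, S2))[0]           # overlap positions within S1
ov2 = np.where(np.isin(S2, S1))[0]           # overlap positions within S2
T, *_ = np.linalg.lstsq(X2[ov2], X1[ov1], rcond=None)

# stitch the aligned embeddings and reconstruct the full matrix
Zhat = np.zeros((n, r))
Zhat[S2] = X2 @ T
Zhat[S1] = X1                                # overlap rows taken from X1
Mhat = Zhat @ Zhat.T

print(f"max entrywise error: {np.max(np.abs(Mhat - M)):.3f}")
```

With more than two submatrices, pairwise alignments of this kind can be chained along overlaps, which is the "chain-linked" aspect of the procedure.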

Keywords

2→∞ norm

normal approximations

matrix completion

data integration 

Co-Author

Minh Tang, North Carolina State University

First Author

Runbing Zheng, Johns Hopkins University

Presenting Author

Runbing Zheng, Johns Hopkins University

Dimension reduction in semi-supervised multiple quantile regression

In this work, we propose a new semi-supervised method for multiple quantile regression. Traditional multiple quantile regression methods often suffer from quantile crossing, where a lower quantile estimate ends up above a higher quantile estimate. To address this, we introduce a non-crossing penalty term that enforces the natural ordering of quantiles. Our framework naturally allows for regularization of the regression coefficient matrix. To compute our estimator, we utilize a splitting algorithm. In simulation studies, we show that our method can lead to improved performance over existing estimators.
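
To make the non-crossing penalty concrete, here is a minimal sketch of a penalized multiple-quantile objective; the hinge form of the penalty, the ridge term, and the parameter names (lam_cross, lam_ridge) are our illustrative assumptions, and the authors' estimator and splitting (ADMM-type) algorithm differ in detail.

```python
import numpy as np

def pinball(u, tau):
    # quantile (pinball) loss rho_tau(u) = u * (tau - 1{u < 0})
    return np.where(u >= 0, tau * u, (tau - 1) * u)

def objective(B, X, y, taus, lam_cross, lam_ridge):
    """Multiple-quantile objective with a non-crossing penalty.

    B: (p, K) coefficient matrix, one column per quantile level,
    with taus in increasing order. The hinge term penalizes fitted
    quantiles that violate the ordering q(tau_k) <= q(tau_{k+1}).
    """
    fits = X @ B                               # (n, K) fitted quantiles
    loss = sum(pinball(y - fits[:, k], t).mean() for k, t in enumerate(taus))
    crossing = np.maximum(fits[:, :-1] - fits[:, 1:], 0.0).sum()
    return loss + lam_cross * crossing + lam_ridge * np.sum(B ** 2)
```

A splitting method such as ADMM can separate the pinball loss, the ordering constraint, and the coefficient penalty into subproblems with simple updates.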

Keywords

Alternating direction method of multipliers

Constrained optimization

Quantile regression

Dimension reduction 

Co-Author(s)

Aaron Molstad, University of Minnesota
Ben Sherwood, University of Kansas

First Author

Youngwoo Kwon

Presenting Author

Youngwoo Kwon

Distributed Tensor Principal Component Analysis with Data Heterogeneity

As tensors become widespread in modern data analysis, Tucker low-rank Principal Component Analysis (PCA) has become essential for dimensionality reduction and structural discovery in tensor datasets. Motivated by the common scenario where large-scale tensors are distributed across diverse geographic locations, this paper investigates tensor PCA within a distributed framework where direct data pooling is theoretically suboptimal or practically infeasible. We offer a comprehensive analysis of three specific scenarios in distributed tensor PCA: a homogeneous setting in which tensors at various locations are generated from a single noise-affected model; a heterogeneous setting where tensors at different locations come from distinct models but share some principal components, aiming to improve estimation across all locations; and a targeted heterogeneous setting, designed to boost estimation accuracy at a specific location with limited samples by utilizing transferred knowledge from other sites with ample data. We introduce novel estimation methods tailored to each scenario, establish statistical guarantees, and develop distributed inference techniques to construct confidence regions.
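
For intuition about the homogeneous setting, the following is a generic one-shot aggregation sketch, not the authors' estimator: each site computes a local mode-1 principal subspace from an unfolded noisy tensor, and the sites' projection matrices are averaged and re-diagonalized. Dimensions, noise level, and names are our assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
p, r, n_sites = 30, 2, 5
U = np.linalg.qr(rng.normal(size=(p, r)))[0]        # shared mode-1 loadings

def local_estimate():
    # each site observes a noisy order-3 tensor sharing the mode-1 subspace
    core = rng.normal(size=(r, 8, 8))
    T = np.einsum('pr,rij->pij', U, core) + 0.1 * rng.normal(size=(p, 8, 8))
    M = T.reshape(p, -1)                            # mode-1 unfolding
    return np.linalg.svd(M, full_matrices=False)[0][:, :r]

# one-shot aggregation: average the local projection matrices,
# then take the top-r eigenvectors of the average
P_bar = sum(Uk @ Uk.T for Uk in (local_estimate() for _ in range(n_sites))) / n_sites
vals, vecs = np.linalg.eigh(P_bar)
U_hat = vecs[:, np.argsort(vals)[::-1][:r]]

# subspace recovery error measured by projection (Frobenius) distance
print(np.linalg.norm(U_hat @ U_hat.T - U @ U.T))
```

Only the p x p projection matrices travel between sites here, which is the sense in which such schemes are communication-efficient.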

Keywords

Tensor Principal Component Analysis

Distributed Inference

Data Heterogeneity

Communication Efficiency

Tucker Decomposition 

Co-Author(s)

Xi Chen, New York University
Wenbo Jing
Yichen Zhang, Purdue University

First Author

Elynn Chen

Presenting Author

Wenbo Jing

Extreme value theory for singular subspace estimation in the matrix denoising model

This paper studies fine-grained singular subspace inference in the matrix denoising model where a deterministic low-rank signal matrix is additively perturbed by a stochastic matrix of independent Gaussian noise. We establish that the maximum Euclidean row norm of the aligned difference between the top-$r$ sample and population singular vector matrices approaches the Gumbel distribution in the large-matrix limit under suitable signal-to-noise conditions after appropriate centering and scaling. Our main results are obtained by a novel synthesis of entrywise matrix perturbation theory and saddle point approximation methods in statistics. The theoretical developments in this paper lead to methodology for testing hypotheses about low-rank signal structure encoded in the singular subspaces spanned by the top-$r$ singular vectors. To develop a data-driven inference procedure, shrinkage-type de-biased estimators are derived for the signal singular values. Our test features asymptotic control of its size and a phase-transition analysis of its power under simple alternative structures.
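
The statistic under study can be simulated directly. Below is an illustrative sketch of the maximum row norm (two-to-infinity norm) of the Procrustes-aligned difference between sample and population singular vectors in a Gaussian matrix denoising model; the centering and scaling sequences required for the Gumbel limit are omitted, and all constants are our assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n, r, reps = 200, 2, 100
U = np.linalg.qr(rng.normal(size=(n, r)))[0]    # population left singular vectors
V = np.linalg.qr(rng.normal(size=(n, r)))[0]
M = U @ np.diag([60.0, 50.0]) @ V.T             # low-rank signal

stats = []
for _ in range(reps):
    A = M + rng.normal(size=(n, n))             # additive Gaussian noise
    Uh = np.linalg.svd(A)[0][:, :r]             # top-r sample singular vectors
    W, _, Qt = np.linalg.svd(Uh.T @ U)          # orthogonal Procrustes alignment
    D = Uh @ (W @ Qt) - U
    stats.append(np.linalg.norm(D, axis=1).max())   # two-to-infinity norm

# after appropriate centering and scaling, this maximum is predicted to
# converge to a Gumbel distribution; here we inspect its raw quantiles
print(np.quantile(stats, [0.5, 0.9, 0.99]))
```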

Keywords

Singular subspace inference

Two-to-infinity norm

Gumbel convergence

Saddle point approximation

Singular value shrinkage 

Co-Author

Joshua Cape, University of Wisconsin-Madison

First Author

Junhyung Chang, University of Wisconsin-Madison

Presenting Author

Junhyung Chang, University of Wisconsin-Madison

Layered Models can "Automatically" Discover Low-Dimensional Structures via Feature Learning

Layered models like neural networks appear to extract key features from data through empirical risk minimization, yet the theoretical understanding of this process remains incomplete. Motivated by these observations, we study a two-layer nonparametric regression model where the input undergoes a linear transformation followed by a nonlinear mapping to predict the output, mirroring the structure of two-layer neural networks. In our model, both layers are optimized jointly through empirical risk minimization, with the nonlinear layer modeled by a reproducing kernel Hilbert space induced by a rotation and translation invariant kernel, regularized by a ridge penalty.
Our main result shows that the two-layer model can "automatically" induce regularization and facilitate feature learning. Specifically, the two-layer model promotes dimensionality reduction in the linear layer and identifies a parsimonious subspace of relevant features, even without applying any norm penalty on the linear layer. Notably, this regularization effect arises directly from the model's layered structure. Real-world data experiments further demonstrate the persistence of this phenomenon in practice.
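
For a fixed linear layer, the inner RKHS problem is a kernel ridge regression with a closed-form solution, so the joint objective can be profiled over the nonlinear layer. The sketch below is our own construction, with a Gaussian RBF kernel (rotation and translation invariant) and illustrative parameter names; the outer optimization over the linear layer A, where the dimension-reduction effect emerges, is omitted.

```python
import numpy as np

def rbf_kernel(X, Z, gamma=1.0):
    # Gaussian RBF kernel: invariant to rotations and translations of inputs
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def two_layer_risk(A, X, y, lam=1e-2):
    """Ridge-regularized empirical risk of the model x -> f(A x).

    For fixed A (k x d), the optimal f in the RKHS is kernel ridge
    regression on the transformed inputs X @ A.T, solvable in closed form.
    """
    XA = X @ A.T
    K = rbf_kernel(XA, XA)
    n = len(y)
    alpha = np.linalg.solve(K + lam * n * np.eye(n), y)
    resid = y - K @ alpha
    return resid @ resid / n + lam * alpha @ K @ alpha
```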

Keywords

layered models

regularization

feature learning

central mean subspace

reproducing kernel Hilbert space

ridge regression 

Co-Author(s)

Yang Li, Massachusetts Institute of Technology
Keli Liu
Feng Ruan, Northwestern University

First Author

Yunlu Chen, Northwestern University

Presenting Author

Yunlu Chen, Northwestern University

Random-projection ensemble dimension reduction

We propose a framework for dimension reduction in high-dimensional regression that aggregates an ensemble of random projections selected on the basis of empirical regression performance. Specifically, we consider disjoint groups of independent random projections, apply a base regression method after each projection is applied to the covariates, and retain the best-performing projection in each group. The selected projections are aggregated by taking the SVD of their empirical average, yielding the leading singular vectors. Notably, the singular values indicate the importance of the corresponding projection directions, aiding the choice of the final projection dimension. We provide recommendations on aspects of our framework, including the projection distribution, the base regression method, and the number of random projections. Additionally, we explore further dimension reduction by applying our algorithm twice when the initially recommended dimension is too large. Our theoretical results show that the error of the algorithm stabilises as the number of projection groups increases. We demonstrate our proposal's strong empirical performance through an extensive study using simulated and real data.
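
The procedure lends itself to a compact sketch. The version below is illustrative only: the base method is ordinary least squares scored by residual sum of squares, and we average the induced projection matrices P P^T (to sidestep the rotational ambiguity of raw projections) before taking the SVD; the authors' recommended choices may differ.

```python
import numpy as np

def rp_ensemble(X, y, d, n_groups=100, group_size=10, rng=None):
    """Illustrative sketch of random-projection ensemble dimension reduction."""
    rng = rng or np.random.default_rng(0)
    n, p = X.shape
    avg = np.zeros((p, p))
    for _ in range(n_groups):
        best, best_rss = None, np.inf
        for _ in range(group_size):
            P = rng.normal(size=(p, d)) / np.sqrt(d)      # random Gaussian projection
            Z = X @ P
            beta, *_ = np.linalg.lstsq(Z, y, rcond=None)  # base method: OLS
            rss = np.sum((y - Z @ beta) ** 2)
            if rss < best_rss:                            # keep the group's best projection
                best, best_rss = P, rss
        avg += best @ best.T / n_groups                   # empirical average over groups
    # leading singular vectors estimate the reduction subspace; the
    # singular values indicate the importance of each direction
    Uhat, svals, _ = np.linalg.svd(avg)
    return Uhat[:, :d], svals[:d]
```

A steep drop in the returned singular values suggests a smaller final projection dimension, which is how the aggregation step can guide that choice.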

Keywords

High-dimensional

mean central subspace

random projection

singular value decomposition

sufficient dimension reduction 

Co-Author

Timothy Cannings, University of Edinburgh

First Author

Wenxing Zhou

Presenting Author

Wenxing Zhou

Sparse Convex Biclustering

Biclustering is an unsupervised machine-learning technique that simultaneously clusters the rows and columns of a data matrix. It has gained increasing attention over the past two decades, driven by the growing complexity and volume of data in fields like genomics, transcriptomics, and other high-throughput omics technologies. However, discovering significant biclusters in large-scale datasets is an NP-hard problem, and the accuracy and stability of most existing biclustering algorithms decrease significantly as dataset size increases, mainly due to the accumulation of noise in high-dimensional features and to non-convex optimization formulations. To address this, we propose a new method called sparse convex biclustering (SCB), which penalizes noise features to zero in the process of biclustering. A tuning criterion based on clustering stability is developed to optimally balance cluster fitting and sparsity. We conduct comprehensive numerical studies using simulated data to demonstrate the superior performance of SCB in comparison to several state-of-the-art alternatives. Furthermore, we apply our method to the analysis of mouse olfactory bulb (MOB) data.
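
To fix ideas, here is a sketch of a sparse convex biclustering objective: the usual convex biclustering fusion penalties on row and column pairs, plus an elementwise l1 term that shrinks noise entries to zero. Pair weights are omitted, the penalty names are our own, and the authors' exact formulation and ADMM solver differ in detail.

```python
import numpy as np
from itertools import combinations

def scb_objective(U, X, lam_row, lam_col, gamma):
    """Sketch of a sparse convex biclustering objective (unweighted pairs).

    Fusing pairs of rows and pairs of columns of the estimate U produces
    the checkerboard bicluster structure; the l1 term adds sparsity.
    """
    fit = 0.5 * np.sum((X - U) ** 2)
    rows = sum(np.linalg.norm(U[i] - U[j])
               for i, j in combinations(range(U.shape[0]), 2))
    cols = sum(np.linalg.norm(U[:, k] - U[:, l])
               for k, l in combinations(range(U.shape[1]), 2))
    sparse = np.abs(U).sum()            # shrinks noise entries to zero
    return fit + lam_row * rows + lam_col * cols + gamma * sparse
```

Because every term is convex, the minimizer is unique and can be computed by operator-splitting methods such as ADMM.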

Keywords

Convex biclustering

Sparsity

ADMM

High-dimensional data 

Co-Author(s)

Chenliang Gu, Center for Statistics and Data Science, Beijing Normal University
Binhuan Wang, AbbVie

First Author

Jiakun Jiang, Beijing Normal University at Zhuhai

Presenting Author

Binhuan Wang, AbbVie