High-Dimensional Tensor Statistical Learning: New Frontiers in AI and Data Science

Chair: Yuefeng Han
Organizers: Yuefeng Han, Elynn Chen

Tuesday, Aug 5: 2:00 PM - 3:50 PM
Session 0760, Topic-Contributed Paper Session
Music City Center, Room CC-106A

Applied: No

Main Sponsor: International Chinese Statistical Association
Co-Sponsors: IMS; Section on Statistical Learning and Data Science

Presentations

Statistical Inference for Low-Rank Tensor Models

Statistical inference for tensors has emerged as a critical challenge in analyzing high-dimensional data in modern data science. This paper introduces a unified framework for inferring general and low-Tucker-rank linear functionals of low-Tucker-rank signal tensors under several low-rank tensor models. Our methodology tackles two primary goals: achieving asymptotic normality and constructing minimax-optimal confidence intervals. By leveraging a debiasing strategy and projecting onto the tangent space of the low-Tucker-rank manifold, we enable inference for general and structured linear functionals, extending far beyond the scope of traditional entrywise inference. Specifically, in the low-Tucker-rank tensor regression and PCA models, we establish the computational and statistical efficiency of our approach, achieving near-optimal sample size requirements (in the regression model) and signal-to-noise ratio (SNR) conditions (in the PCA model) for general linear functionals without requiring sparsity in the loading tensor. Our framework also attains both computationally and statistically optimal sample size and SNR thresholds for low-Tucker-rank linear functionals. Numerical experiments validate our theoretical results, showcasing the framework's utility in diverse applications. This work addresses significant methodological gaps in statistical inference, advancing tensor analysis for complex and high-dimensional data environments.
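
As a rough, self-contained illustration of the tensor PCA setting mentioned above (not the paper's debiased estimator or its tangent-space projection), the sketch below forms a truncated-HOSVD estimate of a low-Tucker-rank signal from a single noisy tensor and evaluates a general linear functional <A, T> by naive plug-in; the dimensions, Tucker ranks, signal strength, and loading tensor A are all illustrative assumptions.

import numpy as np

def unfold(T, mode):
    # Mode-k matricization of a 3-way array.
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def hosvd_estimate(Y, ranks):
    # Truncated HOSVD: leading left singular subspace of each unfolding,
    # then project the observed tensor onto those subspaces.
    U = [np.linalg.svd(unfold(Y, k), full_matrices=False)[0][:, :r]
         for k, r in enumerate(ranks)]
    C = np.einsum('ijk,ia,jb,kc->abc', Y, U[0], U[1], U[2])
    return np.einsum('abc,ia,jb,kc->ijk', C, U[0], U[1], U[2])

rng = np.random.default_rng(0)
dims, ranks = (30, 30, 30), (3, 3, 3)               # illustrative dimensions and Tucker ranks
core = rng.normal(size=ranks)
U = [np.linalg.qr(rng.normal(size=(d, r)))[0] for d, r in zip(dims, ranks)]
T = 30 * np.einsum('abc,ia,jb,kc->ijk', core, *U)   # low-Tucker-rank signal tensor
Y = T + rng.normal(size=dims)                       # tensor PCA observation: signal plus noise

A = rng.normal(size=dims)        # loading tensor of a general (dense) linear functional
T_hat = hosvd_estimate(Y, ranks)
print('plug-in <A, T_hat>:', round(float(np.sum(A * T_hat)), 2))
print('true    <A, T>    :', round(float(np.sum(A * T)), 2))

In the paper's framework, the plug-in step above is replaced by a debiased estimate built through projection onto the tangent space of the low-Tucker-rank manifold, which is what yields asymptotic normality and valid confidence intervals.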

Speaker

Yuefeng Han

Advances in Statistical Machine Learning for Tensors: Parametric and Non-Parametric Approaches

High-order tensor datasets present unique challenges in recommendation systems, neuroimaging, and social networks. In this talk, we discuss our recent advances in developing statistical models, efficient algorithms, and data-driven solutions for high-dimensional tensor problems. Specifically, we introduce two key approaches: parametric tensor block models for higher-order clustering and nonparametric latent variable models for tensor denoising. We establish both statistical and computational guarantees for each method and develop polynomial-time algorithms with provable efficiency. The practical utility of our methods is demonstrated through a neuroimaging data application and social network studies.
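
As a toy illustration of higher-order clustering in a tensor block model (a generic spectral sketch, not necessarily the algorithm presented in the talk), the code below simulates a three-way block model and recovers the mode-1 cluster labels by applying k-means to the leading left singular vectors of the mode-1 unfolding; the dimensions, number of clusters, noise level, and use of scikit-learn's KMeans are assumptions for illustration.

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
n, k = 60, 3                     # nodes per mode and clusters per mode (illustrative)
z = [rng.integers(0, k, size=n) for _ in range(3)]   # true memberships, one vector per mode
B = rng.normal(size=(k, k, k))                       # block-mean (core) tensor
signal = B[np.ix_(z[0], z[1], z[2])]                 # expand block means to an n x n x n tensor
Y = signal + 0.5 * rng.normal(size=(n, n, n))        # noisy observation

# Spectral step: leading left singular vectors of the mode-1 unfolding.
U1 = np.linalg.svd(Y.reshape(n, -1), full_matrices=False)[0][:, :k]

# Cluster the rows of the subspace estimate to recover mode-1 memberships.
labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(U1)

# Compare with the truth up to label permutation via a confusion matrix.
conf = np.zeros((k, k), dtype=int)
for true_lab, est_lab in zip(z[0], labels):
    conf[true_lab, est_lab] += 1
print(conf)     # one dominant entry per row indicates correct recovery up to relabeling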

Keywords

Higher-order tensors, high-dimensional statistics, statistical-computational efficiency, parametric, non-parametric, clustering, denoising

Speaker

Miaoyan Wang, University of Wisconsin-Madison

Tensor Data Analysis and Some Applications in Neuroscience

Multidimensional arrays, or tensors, are becoming increasingly prevalent in a wide range of scientific applications. In this talk, I will present two case studies from neuroscience where tensor decomposition proves particularly useful. The first study is a cross-area neuronal spike train analysis, which we formulate as the problem of regressing a multivariate point process on another multivariate point process. We model the predictor effects through the conditional intensities using a set of basis transfer functions in a convolutional fashion. We then organize the corresponding transfer coefficients into a three-way tensor, and impose low-rank, sparsity, and subgroup structures on this coefficient tensor. The second study is a multimodal neuroimaging analysis for Alzheimer's disease, which we formulate as the problem of modeling the correlations between two sets of variables conditional on a third set of variables. We propose a generalized liquid association analysis method to study such three-way associations. We establish a population dimension reduction model and transform the problem into a sparse decomposition of a three-way tensor.
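
The following toy sketch mimics the shape of the first case study's pipeline under simplifying assumptions: source spike trains are convolved with a few basis transfer functions, per-target coefficients are fit by ordinary least squares (a stand-in for the point-process / conditional-intensity likelihood used in the talk), and the coefficients are folded into a target x source x basis tensor whose low-rank structure can then be inspected; the sizes, the exponential basis, and the planted rank-one coefficient tensor are all assumptions.

import numpy as np

rng = np.random.default_rng(2)
T, q, p, L = 2000, 8, 6, 3       # time bins, source neurons, target neurons, basis functions

# Binned spike counts for the source population (Poisson counts, for illustration).
X = rng.poisson(0.2, size=(q, T)).astype(float)

# Basis transfer functions: decaying exponentials with different time scales.
lags = np.arange(20)
Phi = np.stack([np.exp(-lags / tau) for tau in (2.0, 5.0, 10.0)])   # shape (L, 20)

# Convolve every source spike train with every basis function (causal convolution).
F = np.array([np.convolve(x, phi, mode='full')[:T] for x in X for phi in Phi])  # (q*L, T)

# Planted rank-one coefficient tensor: targets x sources x basis functions.
beta_true = np.einsum('a,b,c->abc', rng.normal(size=p), rng.normal(size=q), rng.normal(size=L))

# Linear-Gaussian stand-in for the point-process (conditional-intensity) model.
Ytar = beta_true.reshape(p, -1) @ F + 0.5 * rng.normal(size=(p, T))

# Per-target least squares, then fold the coefficients back into a three-way tensor.
coef = np.linalg.lstsq(F.T, Ytar.T, rcond=None)[0]   # shape (q*L, p)
beta_hat = coef.T.reshape(p, q, L)

# The mode-1 unfolding of the estimate is close to rank one, as planted.
print(np.round(np.linalg.svd(beta_hat.reshape(p, -1), compute_uv=False), 2))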

Speaker

Lexin Li, University of California-Berkeley

Tucker Decomposition with Structured Core: Identifiability, Stability and Computability

We consider the tensor Tucker decomposition and show that it is uniquely identified up to sign and permutation of the columns of the component matrices, and is stable under small perturbations, when the core tensor satisfies certain structural support conditions. In the presence of noise, we obtain stand-alone error bounds for each column that are unaffected by the other columns. We show that if the core of a higher-order tensor consists of random entries, the uniqueness and stability properties hold with high probability even when the elements of the core tensor are nonzero with probability close to, but bounded away from, one. We also furnish algorithms for performing tensor decompositions in these settings. From an application perspective, our results are useful for making inference about paired latent variable models and can be related to Kronecker-product dictionary learning.
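
To make the "up to sign and permutation of the columns" equivalence concrete (this illustrates only the ambiguity class in the statement, not the paper's identifiability conditions or algorithms), the sketch below applies a signed column permutation to one factor matrix, absorbs its transpose into the core, and checks that the reconstructed tensor is unchanged; the dimensions and ranks are illustrative.

import numpy as np

rng = np.random.default_rng(3)
dims, ranks = (10, 12, 14), (3, 4, 5)
core = rng.normal(size=ranks)
U = [np.linalg.qr(rng.normal(size=(d, r)))[0] for d, r in zip(dims, ranks)]
X = np.einsum('abc,ia,jb,kc->ijk', core, *U)         # Tucker-form tensor

# A signed permutation acting on the columns of the mode-1 factor.
perm = rng.permutation(ranks[0])
signs = rng.choice([-1.0, 1.0], size=ranks[0])
S = np.zeros((ranks[0], ranks[0]))
S[perm, np.arange(ranks[0])] = signs                 # orthogonal: S @ S.T = I

U1_new = U[0] @ S                                    # permute and flip columns of U1
core_new = np.einsum('ab,bcd->acd', S.T, core)       # absorb S^{-1} = S^T into the core

X_new = np.einsum('abc,ia,jb,kc->ijk', core_new, U1_new, U[1], U[2])
print('max difference after signed permutation:', float(np.abs(X - X_new).max()))   # ~ 1e-15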

Speaker

Arnab Auddy, The Ohio State University

TEMPTED: Time-informed dimensionality reduction for longitudinal microbiome studies

Longitudinal studies are crucial for understanding complex microbiome dynamics and their link to health. In this talk, we introduce TEMPoral TEnsor Decomposition (TEMPTED), a time-informed dimensionality reduction method for high-dimensional longitudinal data that treats time as a continuous variable, effectively characterizing temporal information and handling varying temporal sampling. TEMPTED captures key microbial dynamics, facilitates beta-diversity analysis, and enhances reproducibility by transferring learned representations to new data. In simulations, it achieves 90% accuracy in phenotype classification, significantly outperforming existing methods. In real data, TEMPTED identifies vaginal microbial markers linked to term and preterm births, demonstrating robust performance across datasets and sequencing platforms. 
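
For readers new to this data structure, the sketch below runs a plain rank-one alternating decomposition (higher-order power iteration) of a subject x feature x time array on a regular grid, producing subject scores, feature loadings, and a temporal trajectory; unlike TEMPTED, it does not treat time as a continuous variable, handle irregular per-subject sampling, or transfer representations to new data, and all sizes and the simulated signal are assumptions.

import numpy as np

rng = np.random.default_rng(4)
n_sub, n_feat, n_time = 40, 50, 12       # subjects, features (e.g., taxa), time points

# Simulated rank-one longitudinal signal: subject scores x feature loadings x smooth time curve.
a, b = rng.normal(size=n_sub), rng.normal(size=n_feat)
t = np.linspace(0.0, 1.0, n_time)
c = np.sin(2 * np.pi * t)                # smooth temporal trajectory
Y = 3 * np.einsum('i,j,k->ijk', a, b, c) + rng.normal(size=(n_sub, n_feat, n_time))

# Rank-one decomposition via alternating updates (higher-order power iteration).
u = rng.normal(size=n_sub); u /= np.linalg.norm(u)
v = rng.normal(size=n_feat); v /= np.linalg.norm(v)
w = rng.normal(size=n_time); w /= np.linalg.norm(w)
for _ in range(50):
    u = np.einsum('ijk,j,k->i', Y, v, w); u /= np.linalg.norm(u)
    v = np.einsum('ijk,i,k->j', Y, u, w); v /= np.linalg.norm(v)
    w = np.einsum('ijk,i,j->k', Y, u, v); w /= np.linalg.norm(w)

# The recovered temporal loading should track the true trajectory (up to sign).
print('|corr(w, true curve)| =', round(abs(float(np.corrcoef(w, c)[0, 1])), 3))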

Keywords

microbiome, tensor

Co-Author(s)

Anru Zhang, Duke University
Pixu Shi, Duke University
Rungang Han, Duke University

Speaker

Anru Zhang, Duke University