Distributed Tensor Principal Component Analysis with Data Heterogeneity

Xi Chen, Co-Author
New York University

Wenbo Jing, Co-Author and Presenting Author

Yichen Zhang, Co-Author
Purdue University

Elynn Chen, First Author
 
Thursday, Aug 7: 11:05 AM - 11:20 AM
1260 
Contributed Papers 
Music City Center 
As tensors become widespread in modern data analysis, Tucker low-rank Principal Component Analysis (PCA) has become essential for dimensionality reduction and structural discovery in tensor datasets. Motivated by the common scenario in which large-scale tensors are distributed across diverse geographic locations, this paper investigates tensor PCA within a distributed framework where direct data pooling is theoretically suboptimal or practically infeasible. We offer a comprehensive analysis of three scenarios in distributed tensor PCA: a homogeneous setting, in which tensors at all locations are generated from a single underlying model observed with noise; a heterogeneous setting, in which tensors at different locations come from distinct models but share some principal components, with the aim of improving estimation across all locations; and a targeted heterogeneous setting, designed to boost estimation accuracy at a specific location with limited samples by transferring knowledge from other sites with ample data. We introduce novel estimation methods tailored to each scenario, establish statistical guarantees, and develop distributed inference techniques to construct confidence regions.
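To make the homogeneous setting concrete, here is a minimal illustrative sketch (not the authors' proposed estimator): each site observes a noisy tensor sharing the same mode-1 principal subspace, computes the top-r left singular vectors of its local mode-1 unfolding, and a central server aggregates the local projection matrices by averaging and re-extracting the leading eigenvectors. This one-shot averaging heuristic is a standard baseline for distributed subspace estimation; all function and variable names below are hypothetical.

```python
import numpy as np

def mode1_unfold(T):
    """Unfold a 3-way tensor along mode 1: shape (p1, p2*p3)."""
    return T.reshape(T.shape[0], -1)

def local_pc(T, r):
    """Local estimate: top-r left singular vectors of the mode-1 unfolding."""
    U, _, _ = np.linalg.svd(mode1_unfold(T), full_matrices=False)
    return U[:, :r]

def aggregate(U_list, r):
    """Server step: average local projectors U U^T, take top-r eigenvectors."""
    P = sum(U @ U.T for U in U_list) / len(U_list)
    _, V = np.linalg.eigh(P)        # eigenvalues in ascending order
    return V[:, ::-1][:, :r]        # leading r eigenvectors

rng = np.random.default_rng(0)
p1, p2, p3, r, n_sites = 10, 8, 6, 2, 5

# Shared mode-1 loading matrix (homogeneous model across sites).
U_true = np.linalg.qr(rng.standard_normal((p1, r)))[0]

# Each site observes a low-rank tensor plus independent noise.
tensors = [
    np.einsum('ij,jkl->ikl', U_true, rng.standard_normal((r, p2, p3)))
    + 0.1 * rng.standard_normal((p1, p2, p3))
    for _ in range(n_sites)
]

U_hat = aggregate([local_pc(T, r) for T in tensors], r)

# Subspace error: spectral norm of the difference of projection matrices.
err = np.linalg.norm(U_true @ U_true.T - U_hat @ U_hat.T, 2)
print(f"projection distance: {err:.3f}")
```

The communication cost of this sketch is one p1-by-r matrix per site, independent of the local sample sizes, which is the kind of communication efficiency the distributed framework targets.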

Keywords

Tensor Principal Component Analysis

Distributed Inference

Data Heterogeneity

Communication Efficiency

Tucker Decomposition 

Main Sponsor

Section on Statistical Learning and Data Science