Thursday, Aug 6: 10:30 AM - 12:20 PM
6264
Contributed Papers
Main Sponsor
Section on Statistical Learning and Data Science
Presentations
Despite remarkable advances in neuroimaging, current analytical frameworks still struggle to achieve two essential goals: clinical interpretability and computational efficiency, particularly when handling high-dimensional brain data. Although 2D approaches remain widely used, slice-by-slice analysis often fails to capture volumetric continuity, limiting the detection of subtle abnormalities that span slices. Conversely, fully 3D CNN-based models demand excessive computation and memory. To overcome these limitations, we propose a 3D Adaptive Spatial Key-Region Identification (ASKRI) method that achieves both interpretability and efficiency. In this framework, key regions are adaptively enhanced within a Restricted Adjacency-Dependent Mixture Dirichlet Process model, improving interpretability while supporting clinical diagnostics. Applied to brain imaging, the method not only identifies key regions (e.g., the fornix) with high classification accuracy but also isolates clinically meaningful and diagnostically informative ROIs, thereby providing a time-efficient and reliable tool for neuroimaging analysis.
Keywords
3D Convolutional Neural Networks
Adaptive Spatial Key-Region Identification (ASKRI)
Universal Kriging
Restricted Adjacency Matrix
Diffusion Tensor Imaging (DTI)
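The computation-and-memory argument against fully 3D CNNs can be made concrete with a quick parameter count: a 3×3×3 volumetric kernel carries roughly three times the weights of its 3×3 slice-wise counterpart. The channel sizes below are illustrative, not from the paper.

```python
# Quick parameter count contrasting a slice-wise 2D convolution with its
# fully volumetric 3D counterpart; channel sizes are hypothetical.

def conv_params(kernel, c_in, c_out):
    """Weights in a conv layer: prod(kernel) * c_in * c_out, plus c_out biases."""
    n = c_in * c_out
    for k in kernel:
        n *= k
    return n + c_out

p2d = conv_params((3, 3), 64, 128)      # 3x3 kernel applied slice-by-slice
p3d = conv_params((3, 3, 3), 64, 128)   # 3x3x3 volumetric kernel
print(p2d, p3d, p3d / p2d)              # the 3D kernel carries ~3x the weights
```

The same factor compounds per layer, and 3D activations must also be held in memory for entire volumes rather than single slices.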
In many applications, weighted networks are constructed from time series data. A time series is associated with each vertex, and edge weights are given by correlations between time series. This induces dependency among the edges, violating the assumptions of most common network models. Nonetheless, it is common to apply network embedding methods to networks built from correlation data. In this work, we show that this violation of assumptions is not critical. Provided that the time series under study are expressible in terms of a small number of orthogonal sequences, the adjacency spectral embedding provably recovers the true time series. That is, the adjacency spectral embedding applied to correlation networks serves as a denoising process, analogous to principal components analysis. In addition, we show that under suitable sparsity assumptions on the frequency domain, the embedding learned by the adjacency spectral embedding recovers the Fourier coefficients of the true signals. This fact appears to be folklore in the signal processing community in the context of principal component analysis, but it is, to the best of our knowledge, new to the networks literature.
Keywords
Networks
Embeddings
Correlation matrix
Spectral methods
Time series
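The denoising claim can be illustrated with a minimal simulation (sizes, noise level, and the recovery metric below are arbitrary illustrative choices, not from the paper): each vertex's series is a linear combination of a few orthonormal sequences plus noise, and the low-rank spectral decomposition of the correlation matrix recovers the signal subspace.

```python
import numpy as np

# Each vertex's series is a linear combination of d orthonormal sequences plus
# noise; the rank-d spectral embedding of the correlation matrix recovers the
# signal subspace, much like PCA.
rng = np.random.default_rng(0)
n, T, d = 200, 500, 3

Q, _ = np.linalg.qr(rng.standard_normal((T, d)))   # d orthonormal sequences
W = rng.standard_normal((n, d))                    # loadings, one row per vertex
W /= np.linalg.norm(W, axis=1, keepdims=True)
X = W @ Q.T + 0.05 * rng.standard_normal((n, T))   # noisy series at each vertex

C = np.corrcoef(X)                                 # edge weights = correlations
vals, vecs = np.linalg.eigh(C)
U = vecs[:, -d:]                                   # rank-d spectral embedding

# Smallest cosine of the principal angles between estimated and true subspaces
W_orth, _ = np.linalg.qr(W)
sv = np.linalg.svd(W_orth.T @ U, compute_uv=False)
print(sv.min())    # near 1: the embedding has denoised down to the signal space
```

The correlation matrix here plays the role of the (weighted) adjacency matrix, so the top eigenvectors are exactly the adjacency spectral embedding of the correlation network.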
We introduce a general inferential framework for comparing predictor importance in classification models with categorical responses. Our approach is based on the categorical Gini correlation (CGC), a dependence measure between numerical and categorical variables that captures the significance of a predictor for the response. To compare the importance of two predictors with respect to the same categorical outcome, we conduct hypothesis tests on their CGCs. The framework accommodates predictors of arbitrary and unequal dimensionalities. We derive the asymptotic distribution of the test statistic for hypothesis testing and show that the test is consistent. In addition, we propose a nonparametric bootstrap procedure as an alternative to the asymptotic normal-based test. Simulation studies demonstrate the empirical performance of the proposed tests, and applications to two real datasets illustrate their practical utility.
Keywords
Categorical Gini correlation
Comparing correlations
Classification
Predictor importance
Categorical response
Nonparametric bootstrap
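A rough sketch of the ingredients, using the standard distance-based form of the categorical Gini correlation and a naive percentile bootstrap; the paper's actual test statistic, its asymptotic calibration, and the data below are not reproduced here.

```python
import numpy as np

# Distance-based categorical Gini correlation:
#   rho_g(X, y) = (Delta - sum_k p_k * Delta_k) / Delta,
# where Delta is the mean pairwise Euclidean distance of X overall and Delta_k
# the mean within class k. A naive percentile bootstrap then compares two
# predictors of unequal dimension. All data are synthetic.
rng = np.random.default_rng(1)

def mean_pairwise_dist(X):
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    m = len(X)
    return D.sum() / (m * (m - 1))

def gini_cor(X, y):
    delta = mean_pairwise_dist(X)
    within = sum((y == k).mean() * mean_pairwise_dist(X[y == k])
                 for k in np.unique(y))
    return (delta - within) / delta

n = 150
y = rng.integers(0, 2, n)
X1 = y[:, None] + 0.5 * rng.standard_normal((n, 2))   # informative, 2-dim
X2 = rng.standard_normal((n, 3))                      # pure noise, 3-dim

obs = gini_cor(X1, y) - gini_cor(X2, y)
boot = [gini_cor(X1[idx], y[idx]) - gini_cor(X2[idx], y[idx])
        for idx in (rng.integers(0, n, n) for _ in range(200))]
lo = np.percentile(boot, 2.5)
print(obs, lo)   # a bootstrap interval excluding 0 favors X1 over X2
```

Note that the two predictors enter only through pairwise distances, which is what lets the comparison accommodate arbitrary and unequal dimensionalities.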
Calculation of the Chained CPI-U requires monthly item-area expenditure shares from the same time period as the item-area price index relatives. Yet cell-level expenditure data only become available four quarters after the price data. In the interim, the BLS issues a preliminary estimate of the index using a Constant Elasticity of Substitution model. We propose an alternative method for preliminary estimation that instead retains the Törnqvist formula for aggregation and forecasts the missing item-area expenditure data using a set of hierarchical Echo State Networks (ESNs), a class of Recurrent Neural Networks in which the reservoir and input couplings are randomized.
ESNs are flexible, nonlinear, hidden variable models that can predict series with complex temporal dynamics after a relatively simple training process. We develop an iterative procedure to forecast a vector of item expenditures for a given area based on its past expenditure data as well as past and concurrent price data. Additionally, we include the option to supplement the ESN neuron states with discrete Fourier modes at the seasonal frequencies to improve prediction among items with strong seasonal components.
Keywords
Time series
Price indices
Echo State Networks
Neural Networks
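A generic single-reservoir ESN (not the hierarchical configuration proposed above) can be sketched in a few lines: the reservoir and input couplings are drawn at random and left untrained, and only the linear readout is fitted. The toy seasonal series and all sizes below are illustrative.

```python
import numpy as np

# Generic echo state network: a fixed random reservoir driven by the input,
# with only the linear readout trained by ridge regression.
rng = np.random.default_rng(2)
T, n_res = 300, 100
u = np.sin(2 * np.pi * np.arange(T) / 12)[:, None]  # period-12 "monthly" series
y = np.roll(u, -1, axis=0)                          # target: next value

W_in = 0.5 * rng.uniform(-1, 1, (n_res, 1))         # randomized input coupling
W = rng.uniform(-1, 1, (n_res, n_res))              # randomized reservoir
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))     # spectral radius < 1

states = np.zeros((T, n_res))
x = np.zeros(n_res)
for t in range(T):                                  # drive the reservoir forward
    x = np.tanh(W_in @ u[t] + W @ x)
    states[t] = x

wash, ridge = 50, 1e-6                              # discard transient, then fit
S, Y = states[wash:-1], y[wash:-1]
W_out = np.linalg.solve(S.T @ S + ridge * np.eye(n_res), S.T @ Y)
pred = (states[-1] @ W_out).item()                  # one-step-ahead forecast
print(pred, y[-1].item())
```

The "simple training process" in the abstract is exactly the ridge step: only `W_out` is learned. The proposed supplement of the neuron states with discrete Fourier modes would amount to appending seasonal sinusoid columns to `states` before the readout fit.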
Traditional classification methods like k-nearest neighbors (kNN) are widely used in practical applications and have demonstrated effectiveness under the assumption of well-observed networks with known labels. However, in practice, networks are frequently not fully observed due to anonymization, data collection inaccuracies, or missing information, resulting in estimated or entirely unknown node labels. This lack of information can compromise statistical inference when methods rely heavily on label-specific attributes. We investigate the impact of node shuffling on classification performance within a Stochastic Block Model framework. Specifically, we use kNN combined with Procrustes alignment of latent positions to classify graphs from two groups differing by a perturbation. Our empirical and theoretical results reveal that in the homogeneous case, the classification rate declines as the number of shuffled vertices increases. However, for a sufficiently large perturbation, a change point occurs at which the classification rate resurges. Notably, a reflection is observed in the Procrustes alignment at this point, which becomes more pronounced with increasing perturbation.
Keywords
Stochastic Block Model (SBM)
Node shuffling
k-Nearest Neighbors (kNN)
Graph Classification
Procrustes Alignment
Adjacency Spectral Embedding (ASE)
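The pipeline can be sketched on toy data (block probabilities, graph counts, and the leave-one-out 1-NN rule below are illustrative choices, and vertex correspondence is taken as known, i.e., no shuffling): graphs from two SBMs differing by a block-probability perturbation, an adjacency spectral embedding per graph, and classification by nearest neighbor under the Procrustes-aligned distance.

```python
import numpy as np

# Toy version of the classification pipeline: SBM graphs -> ASE -> orthogonal
# Procrustes alignment -> leave-one-out 1-nearest-neighbor classification.
rng = np.random.default_rng(3)
n = 100
z = np.repeat([0, 1], n // 2)                     # two equal blocks

def sbm(B):
    P = B[z][:, z]                                # edge probabilities by block
    A = np.triu((rng.random((n, n)) < P), 1).astype(float)
    return A + A.T

def ase(A, d=2):
    vals, vecs = np.linalg.eigh(A)
    top = np.argsort(np.abs(vals))[-d:]           # d largest-magnitude eigenpairs
    return vecs[:, top] * np.sqrt(np.abs(vals[top]))

def procrustes_dist(U, V):
    # Frobenius distance after the best orthogonal rotation of U onto V
    Q, _, Rt = np.linalg.svd(U.T @ V)
    return np.linalg.norm(U @ (Q @ Rt) - V)

B0 = np.array([[0.5, 0.2], [0.2, 0.5]])
B1 = B0 + 0.2                                     # the perturbation
graphs = [ase(sbm(B0)) for _ in range(20)] + [ase(sbm(B1)) for _ in range(20)]
labels = np.repeat([0, 1], 20)

correct = 0
for i in range(40):
    dists = [procrustes_dist(graphs[i], graphs[j]) if j != i else np.inf
             for j in range(40)]
    correct += labels[np.argmin(dists)] == labels[i]
acc = correct / 40
print(acc)
```

The Procrustes step is needed because ASE latent positions are identifiable only up to an orthogonal transformation; node shuffling would additionally permute the rows of each embedding, which is the degradation the abstract studies.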
Gaussian graphical models in the spectral domain provide a principled framework for identifying conditional dependence structures in stationary high-dimensional time series. Inference for the spectral precision matrix (SPM) at a fixed frequency is challenging because estimation requires smoothing across frequencies, while spectral-domain observations, i.e., discrete Fourier transforms, are only asymptotically independent, have non-sparse precision matrices, and exhibit finite-sample biases that invalidate standard i.i.d. precision matrix inference. We propose an inference framework for sparse high-dimensional SPMs. Our method constructs a debiased complex graphical lasso (deCGLASSO) estimator at a specified frequency. Using asymptotic theory for quadratic forms of stationary multivariate time series, we establish asymptotic normality of the debiased estimator. For each matrix entry, we develop an estimator of the asymptotic covariance by aggregating information across neighboring frequencies. The key theoretical contribution is explicit control of the regularization, truncation, and smoothing biases. We demonstrate the method's empirical performance on simulated data and real fMRI data.
Keywords
Graphical models
Precision matrix estimation
High-dimensional time series
Spectral domain inference
Debiased estimators
Confidence intervals
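For orientation, the object being estimated can be sketched with a plain smoothed-periodogram estimate and direct inversion; this naive baseline has no graphical-lasso penalty and no debiasing, and the simulated series below are illustrative only.

```python
import numpy as np

# Naive spectral precision estimate: average periodogram ordinates over
# neighboring Fourier frequencies, then invert. Off-diagonal entries of the
# inverse encode conditional dependence at that frequency.
rng = np.random.default_rng(4)
T, p = 2048, 4
e = rng.standard_normal((T, p))
X = e.copy()
X[:, 1] += 0.8 * np.roll(e[:, 0], 1)    # series 1 depends on lagged series 0
X -= X.mean(axis=0)

D = np.fft.fft(X, axis=0) / np.sqrt(T)  # DFT ordinates, one row per frequency

def spm_hat(j, bandwidth=40):
    """Smooth I(f_k) = d(f_k) d(f_k)^H over 2*bandwidth+1 frequencies near j."""
    S = np.zeros((p, p), dtype=complex)
    for k in range(j - bandwidth, j + bandwidth + 1):
        d = D[k % T]
        S += np.outer(d, d.conj())
    return np.linalg.inv(S / (2 * bandwidth + 1))

Theta = spm_hat(T // 8)                 # precision estimate at frequency pi/4
print(abs(Theta[0, 1]), abs(Theta[2, 3]))  # dependent pair vs independent pair
```

The smoothing bandwidth here is the source of the smoothing bias the abstract refers to, and in high dimensions the raw inverse is unusable, which is what motivates the penalized estimator and its debiasing.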
Given multiple data matrices, many problems in statistics and data science rely on estimating a common subspace that captures certain structure shared by all the data matrices. In this talk we investigate the statistical and computational limits for the common subspace model, in which one observes a collection of symmetric low-rank matrices perturbed by noise, where each low-rank matrix shares the same common subspace. Our main results identify several regimes of the signal-to-noise ratio (SNR) such that estimation and inference are statistically or computationally optimal, and we refer to these regimes as weak SNR, moderate SNR, strong estimation SNR, and strong inference SNR. Consequently, our results unveil a novel phenomenon: despite the SNR being "above" the computational limit for estimation, adaptive statistical inference may still be information-theoretically impossible.
Keywords
Spectral methods
Multilayer networks
Matrix analysis
Random matrix theory
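A toy instance of the common subspace model, estimated with a simple aggregation rule (top eigenvectors of the sum of squared layers); this is one standard baseline, not necessarily the optimal procedure from the talk, and all sizes and signal strengths below are illustrative.

```python
import numpy as np

# Common subspace model: each observed layer is U diag(lam_l) U^T plus
# symmetric Gaussian noise, with the column space U shared across layers.
# Summing the squared layers and taking top eigenvectors recovers U.
rng = np.random.default_rng(5)
n, r, L, sigma = 120, 3, 15, 1.0

U, _ = np.linalg.qr(rng.standard_normal((n, r)))  # common subspace
A = []
for _ in range(L):
    lam = rng.uniform(5, 10, r) * np.sqrt(n)      # per-layer signal strengths
    E = rng.standard_normal((n, n)) * sigma
    A.append(U @ np.diag(lam) @ U.T + (E + E.T) / np.sqrt(2))

agg = sum(M @ M for M in A)                       # aggregate squared layers
vals, vecs = np.linalg.eigh(agg)
U_hat = vecs[:, -r:]                              # estimated common subspace

# Smallest cosine of the principal angles between U_hat and the truth
sv = np.linalg.svd(U.T @ U_hat, compute_uv=False)
print(sv.min())   # near 1 indicates accurate subspace recovery
```

Squaring before summing keeps the per-layer signals from cancelling when the `lam_l` differ in sign or magnitude across layers, which is why this aggregation is natural for multilayer network problems.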