Identifying Classification Thresholds in Shuffled Stochastic Block Models

Vera Andersson (Speaker)
University of Maryland, College Park
 
Vince Lyzinski (Co-Author)
University of Maryland
 
Thursday, Aug 6: 10:30 AM - 12:20 PM
3535 
Contributed Papers 
Traditional classification methods such as k-nearest neighbors (kNN) are widely used in practice and are effective when networks are fully observed with known labels. In practice, however, networks are often only partially observed due to anonymization, data-collection errors, or missing information, leaving node labels estimated or entirely unknown. This loss of information can compromise statistical inference when methods rely heavily on label-specific attributes. We investigate the impact of node shuffling on classification performance within a Stochastic Block Model framework. Specifically, we use kNN combined with Procrustes alignment of latent positions to classify graphs drawn from two groups that differ by a perturbation. Our empirical and theoretical results show that in the homogeneous case, the classification rate declines as the number of shuffled vertices increases. For a sufficiently large perturbation, however, a change point occurs at which the classification rate resurges. Notably, a reflection appears in the Procrustes alignment at this change point and becomes more pronounced as the perturbation grows.
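The pipeline described above can be sketched in a minimal simulation: sample graphs from two SBMs differing by a perturbation, shuffle some vertex labels, embed each graph via adjacency spectral embedding (ASE), Procrustes-align the embeddings to a reference, and classify with kNN. All parameters below (block matrix, perturbation size, number of shuffled vertices, neighbor count) are illustrative assumptions, not the values used in the work.

```python
import numpy as np
from scipy.linalg import orthogonal_procrustes

rng = np.random.default_rng(0)

def sbm(B, sizes):
    """Sample an undirected SBM adjacency matrix with block matrix B."""
    z = np.repeat(np.arange(len(sizes)), sizes)
    P = B[np.ix_(z, z)]
    A = np.triu((rng.random(P.shape) < P).astype(float), 1)
    return A + A.T

def ase(A, d):
    """Adjacency spectral embedding: top-d eigenvectors scaled by sqrt(|eigenvalue|)."""
    vals, vecs = np.linalg.eigh(A)
    idx = np.argsort(np.abs(vals))[::-1][:d]
    return vecs[:, idx] * np.sqrt(np.abs(vals[idx]))

def shuffle_vertices(A, k):
    """Randomly permute k vertex labels of A (the 'shuffled' vertices)."""
    n = A.shape[0]
    perm = np.arange(n)
    sel = rng.choice(n, size=k, replace=False)
    perm[sel] = rng.permutation(sel)
    return A[np.ix_(perm, perm)]

# Illustrative settings: two 30-vertex blocks, perturbation eps, 10 shuffled vertices.
sizes, d, eps, n_shuffle, k_nn = [30, 30], 2, 0.10, 10, 5
B0 = np.array([[0.5, 0.2], [0.2, 0.5]])
B1 = B0 + eps  # perturbed class

X_ref = ase(sbm(B0, sizes), d)  # reference embedding for Procrustes alignment

def sample_feature(B):
    """One graph -> shuffled adjacency -> ASE -> align to reference -> flat vector."""
    X = ase(shuffle_vertices(sbm(B, sizes), n_shuffle), d)
    R, _ = orthogonal_procrustes(X, X_ref)
    return (X @ R).ravel()

X_train = np.array([sample_feature(B0) for _ in range(40)]
                   + [sample_feature(B1) for _ in range(40)])
y_train = np.array([0] * 40 + [1] * 40)
X_test = np.array([sample_feature(B0) for _ in range(20)]
                  + [sample_feature(B1) for _ in range(20)])
y_test = np.array([0] * 20 + [1] * 20)

def knn_predict(Xtr, ytr, Xte, k):
    """Plain majority-vote kNN on Euclidean distance."""
    d2 = ((Xte[:, None, :] - Xtr[None, :, :]) ** 2).sum(-1)
    nn = np.argsort(d2, axis=1)[:, :k]
    return (ytr[nn].mean(axis=1) > 0.5).astype(int)

acc = (knn_predict(X_train, y_train, X_test, k_nn) == y_test).mean()
print(f"classification rate with {n_shuffle} shuffled vertices: {acc:.2f}")
```

Sweeping `n_shuffle` from 0 to n and `eps` over a grid in this sketch is one way to probe the decline-then-resurgence behavior the abstract describes, though the theoretical change point itself is established in the paper, not by this simulation.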

Keywords

Stochastic Block Model (SBM)

Node shuffling

k-Nearest Neighbors (kNN)

Graph Classification

Procrustes Alignment

Adjacency Spectral Embedding (ASE) 

Main Sponsor

Section on Statistical Learning and Data Science