Identifying Classification Thresholds in Shuffled Stochastic Block Models
Thursday, Aug 6: 10:30 AM - 12:20 PM
3535
Contributed Papers
Traditional classification methods like k-nearest neighbors (kNN) are widely used in practical applications and have demonstrated effectiveness under the assumption of well-observed networks with known labels. However, in practice, networks are frequently not fully observed due to anonymization, data collection inaccuracies, or missing information, resulting in estimated or entirely unknown node labels. This lack of information could compromise statistical inference if methods heavily rely on label-specific attributes. We investigate the impact of node shuffling on classification performance within a Stochastic Block Model framework. Specifically, we use kNN combined with Procrustes alignment of latent positions to classify graphs from two groups differing by a perturbation. Our empirical and theoretical results reveal that in the homogeneous case, the classification rate declines with increasing shuffled vertices. However, for a large enough perturbation, a change point occurs at which the classification rate resurges. Notably, a reflection is observed in the Procrustes alignment at this point, which becomes more pronounced with increasing perturbation.
Stochastic Block Model (SBM)
Node shuffling
k-Nearest Neighbors (kNN)
Graph Classification
Procrustes Alignment
Adjacency Spectral Embedding (ASE)
Main Sponsor
Section on Statistical Learning and Data Science
You have unsaved changes.