Efficient Nonparametric Two-Sample Hypothesis Testing Methods for Large Networks via Subsampling

Srijan Sengupta Co-Author
North Carolina State University
 
Yuguo Chen Co-Author
University of Illinois at Urbana-Champaign
 
Kaustav Chakraborty First Author
 
Kaustav Chakraborty Presenting Author
 
Tuesday, Aug 5: 8:50 AM - 9:05 AM
2314 
Contributed Papers 
Music City Center 
We study two-sample hypothesis testing for random networks under the Random Dot Product Graph (RDPG) framework and develop a time-efficient testing algorithm. We distinguish between semiparametric and nonparametric testing and focus on the latter for its flexibility and its independence from network size. The nonparametric approach assumes that vertex interactions are governed by exchangeable latent distances, and the central question is whether the latent distance distributions differ between the two networks. The test uses a U-statistic that approximates the maximum mean discrepancy (MMD), which is computationally expensive to evaluate for large networks. To address this cost, we introduce a subsampling-based method that partitions a large network, analyzes the smaller subgraphs, and aggregates the results. Our contributions include a subsampling-based latent position estimator, validation of a bootstrap-based testing procedure, and several faster divide-and-conquer testing methods. This work advances efficient and consistent network analysis, with potential applications across diverse domains.
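
To make the computational bottleneck concrete, the following is a minimal sketch of an unbiased U-statistic estimate of the squared MMD between two sets of estimated latent positions. It is not the authors' implementation: the Gaussian kernel, the bandwidth value, and the function names `gaussian_kernel` and `mmd2_unbiased` are assumptions made for illustration. The sketch also assumes the two point clouds are already expressed in a common coordinate system, so it does not handle the orthogonal non-identifiability of RDPG latent positions.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=0.5):
    """Pairwise Gaussian kernel between rows of a and rows of b.

    The kernel choice and the default bandwidth are illustrative assumptions.
    """
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * bandwidth**2))


def mmd2_unbiased(x, y, bandwidth=0.5):
    """Unbiased U-statistic estimate of the squared MMD.

    x: (n, d) array of estimated latent positions for the first network.
    y: (m, d) array of estimated latent positions for the second network.
    """
    n, m = x.shape[0], y.shape[0]
    kxx = gaussian_kernel(x, x, bandwidth)
    kyy = gaussian_kernel(y, y, bandwidth)
    kxy = gaussian_kernel(x, y, bandwidth)
    # Within-sample terms exclude the diagonal (i == j) pairs.
    term_x = (kxx.sum() - np.trace(kxx)) / (n * (n - 1))
    term_y = (kyy.sum() - np.trace(kyy)) / (m * (m - 1))
    # The cross term averages over all n * m pairs.
    return term_x + term_y - 2.0 * kxy.mean()
```

The three kernel matrices have n², m², and nm entries, so the cost of the statistic grows quadratically with the number of vertices, on top of the cost of estimating the latent positions. This is the cost that motivates evaluating the test on smaller subgraphs and aggregating the results.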

Keywords

Two-sample hypothesis testing

Network model

Nonparametric testing

Subsampling

Time-efficient algorithm

Random Dot Product Graph 

Main Sponsor

Section on Statistical Computing