Network Goodness-of-Fit for the block-model family

Tracy Ke Co-Author
Harvard University
 
Jingming Wang Co-Author
 
Jiashun Jin First Author
Carnegie Mellon University
 
Jiajun Tang Presenting Author
 
Wednesday, Aug 6: 2:50 PM - 3:05 PM
2218 
Contributed Papers 
Music City Center 
The block-model family includes four popular network models: SBM, DCBM, MMSBM, and DCMM. To evaluate how well these four models fit real networks, we propose GoF-MSCORE, a new Goodness-of-Fit (GoF) metric for DCMM, based on two main ideas. The first is to use cycle count statistics as a general framework for GoF. The second is a novel network fitting scheme. Extending GoF-MSCORE to SBM, DCBM, and MMSBM yields a series of GoF metrics, one for each of the four models in the block-model family. We show that for each of the four models, if the assumed model is correct, then as the network size diverges, the corresponding GoF metric converges to N(0,1), a parameter-free limiting null distribution. We also analyze the power of these metrics and demonstrate that they are optimal in many settings. Applying the proposed GoF metrics to 12 frequently used real networks, we find that DCMM fits almost all of them well, whereas SBM, DCBM, and MMSBM fail to fit many of these networks, particularly the relatively large ones. We also show that DCMM is nearly as broad as the rank-K network model. Based on these results, we recommend DCMM as a promising model for undirected networks.
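
As a rough illustration of what a cycle count looks like in practice, the sketch below counts 3-cycles (triangles) and 4-cycles in a small undirected network from its adjacency matrix. It is not the GoF-MSCORE statistic, whose exact form and fitting scheme are not given in this abstract; the function names `matmul` and `cycle_counts` and the example graph are assumptions made purely for illustration. The counts rely on two standard identities for a simple undirected graph with adjacency matrix A and degrees d_i: the number of triangles is trace(A^3)/6, and the number of 4-cycles is (trace(A^4) - 2*sum(d_i^2) + sum(d_i))/8.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def cycle_counts(adj):
    """Return (number of triangles, number of 4-cycles) for a simple
    undirected graph with a symmetric 0/1 adjacency matrix `adj`
    (zero diagonal). Hypothetical helper, for illustration only."""
    n = len(adj)
    a2 = matmul(adj, adj)  # a2[i][j] = number of length-2 walks from i to j
    # trace(A^3) and trace(A^4), computed from A^2 without forming higher powers
    tr3 = sum(a2[i][j] * adj[j][i] for i in range(n) for j in range(n))
    tr4 = sum(a2[i][j] * a2[j][i] for i in range(n) for j in range(n))
    deg = [sum(row) for row in adj]
    # Each triangle contributes 6 closed walks of length 3.
    triangles = tr3 // 6
    # Remove closed 4-walks that backtrack along edges or 2-paths;
    # each remaining 4-cycle contributes 8 closed walks of length 4.
    four_cycles = (tr4 - 2 * sum(d * d for d in deg) + sum(deg)) // 8
    return triangles, four_cycles


# Example: the complete graph on 4 nodes.
k4 = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
k4_counts = cycle_counts(k4)
print(k4_counts)
```

A goodness-of-fit procedure in this spirit would compare observed counts like these with what a fitted model predicts, after suitable centering and scaling; those steps are specific to the paper and are not reproduced here.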

Keywords

Network analysis

Goodness-of-Fit

Block model

Community detection

Mixed membership

Cycle-Count statistics 

Main Sponsor

Section on Statistical Learning and Data Science