New Statistical Methods for Network Data Analysis

Chang Su, Chair
Emory University
 
Emma Jingfei Zhang, Organizer
Emory University
 
Sunday, Aug 4: 4:00 PM - 5:50 PM
1609 
Topic-Contributed Paper Session 
Oregon Convention Center 
Room: CC-D135 

Applied

No

Main Sponsor

Section on Statistical Computing

Co-Sponsors

IMS
Section on Statistical Graphics

Presentations

Modularity Based Methods for Network Data

We introduce several network modularity measures for both single-layer and multi-layer networks under different null models of the network, motivated by empirical observations in networks from a diverse range of application fields. We describe a statistical framework for modularity-based network community detection. The effectiveness of the proposed methods is demonstrated on simulated and real networks.

Speaker

Yuguo Chen, University of Illinois at Urbana-Champaign

Measurement error and network homophily in autoregressive models of peer effects

Autoregressive models of peer effects include the SAR model, used in cross-sectional studies to estimate peer influence and covariate effects while taking network dependence into account, and the longitudinal model, used to causally identify the effect of peer actions in the preceding time period. We investigate issues of measurement error and network homophily in both of these setups.
First, we investigate the causal effect of peer role models on successful graduation from Therapeutic Communities (TCs) for substance abuse, using records of exchanges among residents and their entry and exit dates, which allow us to form peer networks and define a causal estimand. To identify peer influence in the presence of unobserved homophily, we model the network with a latent variable model and show that our peer influence estimator is asymptotically unbiased. Second, in the context of the SAR model, while the model can be estimated with a QMLE approach, the detrimental effect of covariate measurement error on the QMLE and how to remedy it are currently unknown. We develop a measurement error-corrected ML estimator and show that it is consistent and asymptotically normal.

Speaker

Subhadeep Paul, The Ohio State University

Homophily-adjusted social influence estimation

Homophily and social influence are two key concepts in social network analysis. Distinguishing between these phenomena is difficult, and approaches to disambiguate the two have been primarily limited to longitudinal data analyses. In this study, we provide sufficient conditions for valid estimation of social influence from cross-sectional data, leading to a novel homophily-adjusted social influence model which addresses the backdoor pathway of latent homophilic features. The oft-used network autocorrelation model (NAM) is the special case of our proposed model with no latent homophily, suggesting that the NAM is only valid when all homophilic attributes are observed. To assess the performance of our model, we conducted a comprehensive simulation study, comparing its results to those of other methods designed for cross-sectional data. Our findings shed light on the nuanced dynamics of social networks, presenting a valuable tool for researchers seeking to estimate the effects of social influence while accounting for homophily.

Speaker

Daniel Sewell, University of Iowa

Modeling networks with textual edges

Edges in many real-world networks are associated with rich text information, such as email communications between accounts and interactions between social media users. To better account for the rich text information, we propose a new latent space network model that treats texts as embedded vectors. We establish a set of identifiability conditions for the proposed model and formulate a projected gradient descent algorithm for model estimation. We further investigate theoretical properties of the iterates from the proposed algorithm. The efficacy of our method is demonstrated through simulations and an analysis of the Enron email dataset. 

Speaker

Emma Jingfei Zhang, Emory University