Simultaneous Community Detection and Missing Data Imputation for Networks with Node Covariates

Katherine McLaughlin Co-Author
Oregon State University
 
James Molyneux Co-Author
Swyfft, LLC
 
Gauri Phatak First Author
Oregon State University
 
Gauri Phatak Presenting Author
Oregon State University
 
Monday, Aug 5: 12:05 PM - 12:20 PM
3847 
Contributed Papers 
Oregon Convention Center 

Description

Analyzing social network data, including community detection, is challenging when values are missing, whether these are node covariates, edges, or entire nodes. Node covariates supplement the network structure as a source of information for community detection. We propose an iterative method that simultaneously imputes missing covariates and performs covariate-assisted community detection for networks modelled using Exponential Random Graph Models (ERGMs).
The proposed model is assessed using simulated network data with known communities and covariate values. In addition to simulated networks, time series of networks are generated based on human movement between Oregon cities that participated in a wastewater surveillance program run by a team at Oregon State University since mid-2020. The COVID wastewater data, along with demographic and other COVID metrics, are treated as node covariates. Some of these covariates are assumed to be missing at random (MAR). Wastewater-based epidemiology is an effective approach for monitoring the presence, prevalence, and trend of diseases and for understanding their spread through human movement networks.
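
The alternating scheme described above (impute missing covariates, then re-detect communities, and repeat) can be illustrated with a minimal Python sketch. This is not the authors' ERGM-based method: as stand-ins, it assumes a simple spectral clustering step on a similarity matrix that combines adjacency and covariates, and it imputes each missing value with the mean of the observed values in the node's current community. The function names `kmeans` and `detect_and_impute`, the weight `alpha`, and the stopping rule are all hypothetical choices made for illustration.

```python
import numpy as np


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns one integer label per row."""
    rng = np.random.default_rng(seed)
    # Fancy indexing returns a copy, so updating centers leaves points intact.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = points[labels == c].mean(axis=0)
    return labels


def detect_and_impute(adj, covs, k, alpha=1.0, n_rounds=10):
    """Alternate between community detection and covariate imputation.

    adj  : (n, n) symmetric 0/1 adjacency matrix
    covs : (n, p) covariates, with np.nan marking missing entries
           (assumes every column has at least one observed value)
    NOTE: illustrative stand-in only; no ERGM is fitted here.
    """
    missing = np.isnan(covs)
    # Start from column-mean imputation.
    filled = np.where(missing, np.nanmean(covs, axis=0), covs)
    labels = np.zeros(len(adj), dtype=int)

    for _ in range(n_rounds):
        # Detection step: cluster the leading eigenvectors of a matrix
        # that mixes network structure with standardized covariates.
        z = (filled - filled.mean(axis=0)) / (filled.std(axis=0) + 1e-12)
        sim = adj @ adj + alpha * (z @ z.T)
        eigvals, eigvecs = np.linalg.eigh(sim)
        top = np.argsort(np.abs(eigvals))[-k:]
        new_labels = kmeans(eigvecs[:, top], k)

        # Imputation step: replace each missing value with the mean of
        # the observed values in that node's current community.
        for c in range(k):
            members = new_labels == c
            for j in range(covs.shape[1]):
                observed = members & ~missing[:, j]
                if observed.any():
                    filled[members & missing[:, j], j] = covs[observed, j].mean()

        # Stop early once the community assignment no longer changes.
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels

    return labels, filled
```

Calling `detect_and_impute(adj, covs, k=2)` on a network with two planted communities returns a label vector and a completed covariate matrix; observed covariate values are left unchanged and only the missing entries are filled in.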

Keywords

Network community detection

Wastewater-based epidemiology

Time series networks

Network missing data imputation

Human movement network

Disease spread in human movement network

Main Sponsor

Health Policy Statistics Section