Interpretable network-assisted prediction

Elizaveta Levina, Speaker
University of Michigan
 
Tuesday, Aug 6: 9:15 AM - 9:35 AM
Invited Paper Session 
Oregon Convention Center 
Machine learning algorithms often assume that training samples are independent. When data points are connected by a network, the resulting dependence between samples is both a challenge, since it reduces the effective sample size, and an opportunity, since prediction can be improved by leveraging information from network neighbors. Multiple prediction methods that take advantage of this opportunity are now available. Many of them, including graph neural networks, are not easily interpretable, which limits their usefulness in the biomedical and social sciences, where understanding how a model makes its predictions is often more important than the prediction itself. Others, such as network-assisted linear regression, are interpretable but generally do not achieve prediction accuracy comparable to that of more flexible models. We bridge this gap by proposing a family of flexible network-assisted models built on a generalization of random forests (RF+), which achieves highly competitive prediction accuracy and can be interpreted through feature importance measures. In particular, we provide a suite of novel interpretation tools that enable practitioners not only to identify important features driving model predictions, but also to quantify the importance of the network contribution to prediction. This suite of general tools broadens the scope and applicability of network-assisted machine learning for high-impact problems where interpretability and transparency are essential. This is joint work with Tiffany Tang and Ji Zhu.
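
The abstract describes a general workflow: inject network information into a flexible tabular model, then attribute importance to the network's contribution. The sketch below is an illustration of that idea only, not the RF+ method of the talk; it proxies the network contribution with neighbor-averaged covariates and uses a standard random forest with permutation importance. The simulated data, the neighbor-averaging construction, and the block-wise importance aggregation are all assumptions made for the example.

```python
# Illustrative sketch (NOT the RF+ method from the talk): network-assisted
# prediction via neighbor-averaged features, with a crude block-wise
# importance measure separating "own" features from "network" features.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n, p = 300, 5

# Simulated node covariates and a sparse symmetric adjacency matrix
# (purely hypothetical data for the sketch).
X = rng.normal(size=(n, p))
A = (rng.random((n, n)) < 0.05).astype(float)
A = np.triu(A, 1)
A = A + A.T

# Neighbor-averaged covariates: one simple way to carry network
# information into a tabular model.
deg = A.sum(axis=1, keepdims=True)
X_nbr = (A @ X) / np.maximum(deg, 1)

# Outcome depends on both a node's own features and its neighbors'.
y = X[:, 0] + 0.8 * X_nbr[:, 0] + 0.1 * rng.normal(size=n)

Z = np.hstack([X, X_nbr])  # columns 0..p-1: own; p..2p-1: network
Z_tr, Z_te, y_tr, y_te = train_test_split(Z, y, random_state=0)

rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(Z_tr, y_tr)

# Permutation importance, summed within each block, gives a rough
# analogue of quantifying the network's contribution to prediction.
imp = permutation_importance(rf, Z_te, y_te, n_repeats=20, random_state=0)
own = imp.importances_mean[:p].sum()
net = imp.importances_mean[p:].sum()
print(f"own-feature importance: {own:.3f}, network importance: {net:.3f}")
```

In this toy setup, the network block's aggregated importance should come out clearly nonzero, since the outcome was constructed to depend on neighbors' covariates; the RF+ tools described in the talk provide principled versions of this kind of decomposition.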