Doubly Robust Conditional Independence Testing with Generative Neural Networks

Yi Zhang Co-Author
 
Linjun Huang Co-Author
University of Illinois Urbana-Champaign
 
Yun Yang Co-Author
University of Illinois Urbana-Champaign
 
Xiaofeng Shao Co-Author
Washington University in St. Louis, Department of Statistics and Data Science
 
Yi Zhang Speaker
 
Tuesday, Aug 5: 11:05 AM - 11:20 AM
Topic-Contributed Paper Session 
Music City Center 
This article addresses the problem of testing the conditional independence of two generic random vectors X and Y given a third random vector Z, which plays an important role in statistical and machine learning applications. We propose a new non-parametric testing procedure that avoids explicitly estimating any conditional distributions and instead only requires sampling from the two marginal conditional distributions of X given Z and Y given Z. We further propose a generative neural network (GNN) framework for sampling from these approximated marginal conditional distributions, which tends to mitigate the curse of dimensionality owing to its adaptivity to low-dimensional structure and smoothness underlying the data. Theoretically, our test statistic is shown to enjoy a double robustness property against GNN approximation errors, meaning that it retains all desirable properties of the oracle test statistic built from the true marginal conditional distributions, as long as the product of the two approximation errors decays to zero faster than the parametric rate. Asymptotic properties of our statistic and the consistency of a bootstrap procedure are derived under both the null and local alternatives. Extensive numerical experiments and real data analysis illustrate the effectiveness and broad applicability of our proposed test.
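To make the sampling-based idea concrete, below is a minimal, hypothetical sketch (not the authors' actual statistic) of a kernel conditional independence test in the same spirit: given a sampler for the approximate conditional law of X given Z, one compares the observed joint sample (X, Y, Z) against a surrogate sample (X', Y, Z) via the maximum mean discrepancy (MMD), since under the null X ⟂ Y | Z the two joint distributions coincide. The function names, the Gaussian kernel choice, the fixed bandwidth, and the toy Gaussian conditional sampler standing in for a trained GNN are all illustrative assumptions.

```python
import numpy as np

def gaussian_kernel(A, B, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return np.exp(-sq / (2.0 * bandwidth**2))

def mmd2(U, V, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between samples U and V."""
    return (gaussian_kernel(U, U, bandwidth).mean()
            + gaussian_kernel(V, V, bandwidth).mean()
            - 2.0 * gaussian_kernel(U, V, bandwidth).mean())

def ci_test_statistic(X, Y, Z, sample_x_given_z, bandwidth=1.0):
    """MMD between (X, Y, Z) and (X', Y, Z), where X' is drawn from an
    (approximate) conditional law of X given Z; under H0: X indep. Y | Z
    the two joint samples share the same distribution."""
    X_tilde = sample_x_given_z(Z)
    U = np.hstack([X, Y, Z])
    V = np.hstack([X_tilde, Y, Z])
    return mmd2(U, V, bandwidth)

# Toy check: X and Y depend on Z only, so H0 holds; the true conditional
# sampler X | Z ~ N(Z, 1) stands in for a trained generative model.
rng = np.random.default_rng(0)
n = 500
Z = rng.normal(size=(n, 1))
X = Z + rng.normal(size=(n, 1))
Y = Z + rng.normal(size=(n, 1))
stat = ci_test_statistic(X, Y, Z, lambda z: z + rng.normal(size=z.shape))
print(stat)  # small nonnegative value under H0
```

In practice the null distribution of such a statistic is not pivotal, so calibration would proceed by a bootstrap that redraws X' from the fitted conditional sampler, in line with the bootstrap consistency discussed in the abstract.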

Keywords

Conditional Distribution

Conditional Independence Test

Double Robustness

Generative Models

Kernel Method

Maximum Mean Discrepancy