δ-Invariant Feature Selection: Stable Signal Recovery Across Heterogeneous Domains

Ivy Zhang, Speaker
 
Wednesday, Aug 5: 10:30 AM - 12:20 PM
Topic-Contributed Paper Session 
We aim to identify features that predict an outcome of interest across multiple datasets. In particular, we seek a subset of features whose relationship with the outcome generalizes to an unseen future population. We formulate this feature selection problem under sparse linear models, allowing the conditional distribution Y|X to shift across domains. We propose δ-invariant feature selection, which selects features whose estimated coefficients are sign-consistent across datasets and whose strength exceeds a fixed δ-relevance threshold. Through empirical examples and theoretical analysis, we examine conditions under which the proposed procedure consistently recovers the population δ-invariant feature set and yields a feature set with small out-of-distribution prediction error.
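
As a rough illustration of the selection rule described above, the following minimal sketch takes per-domain coefficient estimates as given and keeps the features that pass both checks. It is not the procedure analyzed in the talk: the function name `delta_invariant_features`, the reading of "strength" as the smallest absolute coefficient across domains, and the example numbers are all assumptions made for illustration, and the abstract does not say how the coefficients are estimated.

```python
from typing import List, Sequence


def delta_invariant_features(
    coefs: Sequence[Sequence[float]], delta: float
) -> List[int]:
    """Return indices of features passing both checks (hypothetical helper).

    coefs[k][j] is the estimated coefficient of feature j in domain k.
    A feature is kept when its coefficients share one strict sign in every
    domain and its smallest absolute coefficient exceeds delta (assumed
    interpretation of the delta-relevance threshold).
    """
    if not coefs:
        return []
    n_features = len(coefs[0])
    selected = []
    for j in range(n_features):
        column = [domain[j] for domain in coefs]
        # Sign consistency: all strictly positive or all strictly negative.
        same_sign = all(c > 0 for c in column) or all(c < 0 for c in column)
        # Relevance: the weakest domain still clears the threshold.
        strong = min(abs(c) for c in column) > delta
        if same_sign and strong:
            selected.append(j)
    return selected


# Made-up coefficient estimates from three domains (illustrative only).
coefs = [
    [0.9, -0.4, 0.05, 0.6],
    [1.1, 0.3, 0.02, 0.5],
    [0.8, -0.5, -0.01, 0.7],
]
print(delta_invariant_features(coefs, delta=0.25))  # prints [0, 3]
```

In this toy input, feature 1 is dropped because its sign flips in the second domain, and feature 2 is dropped because it is both weak and sign-inconsistent. Raising `delta` to 0.75 would also drop feature 3, whose weakest coefficient is 0.5.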

Keywords

Distribution shift

Sparse regression

Invariance

Variable selection

Out-of-distribution prediction