δ-Invariant Feature Selection: Stable Signal Recovery Across Heterogeneous Domains
Wednesday, Aug 5: 10:30 AM - 12:20 PM
Topic-Contributed Paper Session
We aim to identify features that predict an outcome of interest across multiple datasets, and in particular to recover a subset of features whose relationship with the outcome generalizes to an unseen future population. We study this feature selection problem under sparse linear models, allowing the conditional distribution Y|X to shift across domains. We propose δ-invariant feature selection, which selects features whose estimated coefficients are sign-consistent across datasets and whose strength exceeds a fixed δ-relevance threshold. Through empirical examples and theoretical analysis, we study conditions under which the proposed procedure consistently recovers the population δ-invariant feature set and yields a feature set with small out-of-distribution prediction error.
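The selection rule described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the per-dataset coefficients are estimated or how "strength" is measured, so the sketch takes the coefficient estimates as given and assumes strength means the smallest absolute coefficient across datasets. The function name `delta_invariant_select` and the example numbers are hypothetical.

```python
from typing import List, Sequence


def delta_invariant_select(coefs: Sequence[Sequence[float]], delta: float) -> List[int]:
    """Return indices of features that pass the sketch's selection rule.

    coefs[k][j] is the estimated coefficient of feature j in dataset k.
    Assumption: "strength" is the minimum absolute coefficient across
    datasets; the abstract does not specify the exact measure.
    """
    if not coefs:
        return []
    n_features = len(coefs[0])
    selected = []
    for j in range(n_features):
        column = [domain[j] for domain in coefs]
        # Sign consistency: strictly positive in every dataset, or strictly negative.
        same_sign = all(b > 0 for b in column) or all(b < 0 for b in column)
        # Relevance: the weakest estimate must still exceed the threshold delta.
        strong = min(abs(b) for b in column) > delta
        if same_sign and strong:
            selected.append(j)
    return selected


# Illustrative (made-up) coefficient estimates: 3 datasets, 4 features.
coefs = [
    [1.2, 0.4, -0.9, 0.05],
    [0.8, -0.5, -1.1, 0.02],
    [1.0, 0.6, -0.7, 0.30],
]
selected = delta_invariant_select(coefs, delta=0.1)
print(selected)  # [0, 2]
```

In this toy input, feature 1 is dropped because its sign flips in the second dataset, and feature 3 is dropped because its smallest estimate falls below δ even though its sign is stable.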
Distribution shift
Sparse regression
Invariance
Variable selection
Out-of-distribution prediction