δ-Invariant Feature Selection: Stable Signal Recovery Across Heterogeneous Domains

Presented During: Navigating High-Dimensional Landscapes: Innovations in Model Estimation and Predictive Inference

Ivy Zhang Speaker

Wednesday, Aug 5: 10:30 AM - 12:20 PM
Topic-Contributed Paper Session

We aim to identify features that predict an outcome of interest across multiple datasets. In particular, we seek to recover a subset of features whose relationship with the outcome generalizes to an unseen future population. We formulate this feature selection problem under sparse linear models, allowing the conditional distribution Y|X to shift across domains. We propose δ-invariant feature selection, which selects features whose estimated coefficients are sign-consistent across datasets and whose strength exceeds a fixed δ-relevance threshold. Through empirical examples and theoretical analysis, we examine conditions under which the proposed procedure consistently recovers the population δ-invariant feature set and yields a feature set with small out-of-distribution prediction error.
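
The selection rule described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that per-dataset coefficient estimates are already available from some sparse regression fit, and that "strength" means the smallest absolute coefficient across datasets. The function name `delta_invariant_features` and the numbers in `coef_hat` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def delta_invariant_features(
    coefs: Sequence[Sequence[float]], delta: float
) -> List[int]:
    """Return indices of features that pass both checks.

    coefs[k][j] is the estimated coefficient of feature j in dataset k.
    Assumption (not from the abstract): a feature's strength is taken to be
    its smallest absolute coefficient across datasets.
    """
    if not coefs:
        raise ValueError("need coefficient estimates from at least one dataset")
    if delta < 0:
        raise ValueError("delta must be non-negative")
    n_features = len(coefs[0])
    if any(len(row) != n_features for row in coefs):
        raise ValueError("all datasets must have the same number of features")

    selected = []
    for j in range(n_features):
        column = [row[j] for row in coefs]
        # Sign consistency: strictly positive everywhere or strictly negative everywhere.
        same_sign = all(c > 0 for c in column) or all(c < 0 for c in column)
        # Relevance: the weakest estimate must still exceed the threshold.
        strong = min(abs(c) for c in column) > delta
        if same_sign and strong:
            selected.append(j)
    return selected


# Made-up coefficient estimates for four features in three datasets.
coef_hat = [
    [0.90, -0.40, 0.05, 0.30],
    [1.10, -0.35, -0.02, -0.25],
    [0.80, -0.50, 0.00, 0.40],
]
selected = delta_invariant_features(coef_hat, delta=0.1)
print(selected)  # [0, 1]
```

In this toy input, features 0 and 1 keep the same sign in every dataset and stay above the threshold, while features 2 and 3 change sign and are dropped. Raising `delta` to 0.5 would also drop feature 1, whose weakest estimate has magnitude 0.35.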

Keywords

Distribution shift

Sparse regression

Invariance

Variable selection

Out-of-distribution prediction