Properties and Applications of Feature Whitening

Conference: Women in Statistics and Data Science 2022
10/07/2022: 3:00 PM - 3:30 PM CDT
Concurrent 
Room: Grand Ballroom Salon F 

Presentations

Description

Strong correlations among features are common across domains and are a well-known hurdle for existing selection and screening methods. We explore several properties of ZCA whitening, a pre-processing step that transforms the features and that we and others have shown can greatly improve accuracy in certain selection procedures. However, this whitening method achieves complete decorrelation at the cost of similarity with the original set of predictors, and thus of interpretability. We propose a more general technique, ORTHOMAP, that allows one to directly control the level of collinearity permitted among features in order to strengthen the mapping between original and transformed variables. We show that this approach can be formulated as a second-order cone program (SOCP), and we describe its connection with ZCA. We demonstrate the benefits and drawbacks of ORTHOMAP, along with other decorrelation procedures, through numerical experiments and a real data application concerning COVID-19 mortality curves in regions across Italy. These experiments also highlight an important aspect of ZCA and ORTHOMAP: both can be used across different modeling techniques and/or response structures.
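As a rough illustration of the ZCA whitening step mentioned above, the following Python sketch uses the textbook form of the transform, W = S^(-1/2), where S is the sample covariance of the centered features. It is not the authors' implementation, and it does not cover ORTHOMAP, whose formulation is not given here. The function name zca_whiten, the eps stabilizer, and the simulated data are assumptions made for this example.

```python
import numpy as np


def zca_whiten(X, eps=1e-8):
    """Textbook ZCA whitening (hypothetical helper, not from the talk).

    Returns the whitened features Z and the symmetric whitening matrix W,
    where W is the inverse square root of the sample covariance of X.
    """
    Xc = X - X.mean(axis=0)                  # center each feature
    cov = np.cov(Xc, rowvar=False)           # p x p sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigendecomposition of a symmetric matrix
    # eps guards against division by near-zero eigenvalues (an assumption of this sketch)
    W = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.T
    return Xc @ W, W


# Simulated features that share a common factor, so they are strongly correlated.
rng = np.random.default_rng(0)
n, p = 500, 4
shared = rng.normal(size=(n, 1))
X = 0.8 * shared + 0.6 * rng.normal(size=(n, p))

Z, W = zca_whiten(X)
cov_Z = np.cov(Z, rowvar=False)              # approximately the identity matrix
```

Because W is symmetric, each whitened feature is a combination of the original features rotated back into the original coordinate system, which is what distinguishes ZCA from PCA whitening. The sketch whitens with the covariance matrix; a correlation-based variant would standardize the features first.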

Keywords

Feature decorrelation

Variable selection

Functional data analysis 

Presenting Author

Ana Kenney

First Author

Ana Kenney

CoAuthor

Francesca Chiaromonte, Penn State University

Target Audience

Mid-Level

Tracks

Knowledge
Women in Statistics and Data Science 2022