Layered Models can "Automatically" Discover Low-Dimensional Structures via Feature Learning
Yang Li
Co-Author
Massachusetts Institute of Technology
Yunlu Chen
Presenting Author
Northwestern University
Thursday, Aug 7: 11:35 AM - 11:50 AM
1382
Contributed Papers
Music City Center
Layered models like neural networks appear to extract key features from data through empirical risk minimization, yet the theoretical understanding of this process remains limited. Motivated by this observation, we study a two-layer nonparametric regression model in which the input undergoes a linear transformation followed by a nonlinear mapping to predict the output, mirroring the structure of two-layer neural networks. In our model, both layers are optimized jointly through empirical risk minimization, with the nonlinear layer modeled by a reproducing kernel Hilbert space induced by a rotation- and translation-invariant kernel and regularized by a ridge penalty.
Our main result shows that the two-layer model can "automatically" induce regularization and facilitate feature learning. Specifically, the two-layer model promotes dimensionality reduction in the linear layer and identifies a parsimonious subspace of relevant features, even without applying any norm penalty on the linear layer. Notably, this regularization effect arises directly from the model's layered structure. Experiments on real-world data further demonstrate that this phenomenon persists in practice.
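The abstract describes the model only in words; the following is a minimal, hypothetical sketch of such a two-layer fit on synthetic single-index data, not the authors' implementation. The kernel choice (Gaussian), the linear-layer width, the ridge level, the step size, and all variable names are assumptions made for illustration; the linear layer is updated by gradient descent while the RKHS layer is re-solved in closed form, which is one possible way to carry out the joint empirical risk minimization.

```python
# Illustrative sketch only: two-layer model y ~ f(A x), with A a linear layer and
# f in the RKHS of a Gaussian (rotation- and translation-invariant) kernel,
# fit jointly by ridge-regularized empirical risk minimization.
import jax
import jax.numpy as jnp

key = jax.random.PRNGKey(0)
n, d, r = 200, 10, 3                      # samples, input dimension, linear-layer width (assumed)
lam = 1e-2                                # ridge penalty on the RKHS layer (assumed)

# Synthetic data whose regression function depends on a single direction.
kx, ke, ka = jax.random.split(key, 3)
X = jax.random.normal(kx, (n, d))
beta = jnp.zeros(d).at[0].set(1.0)        # true one-dimensional relevant subspace
y = jnp.sin(X @ beta) + 0.1 * jax.random.normal(ke, (n,))

def gram(Z, bw=1.0):
    """Gaussian kernel Gram matrix of the transformed inputs."""
    sq = jnp.sum((Z[:, None, :] - Z[None, :, :]) ** 2, axis=-1)
    return jnp.exp(-sq / (2.0 * bw ** 2))

def risk(A):
    """Regularized empirical risk with the RKHS layer profiled out in closed form."""
    K = gram(X @ A.T)
    alpha = jnp.linalg.solve(K + n * lam * jnp.eye(n), y)   # kernel ridge solution
    resid = y - K @ alpha
    return jnp.mean(resid ** 2) + lam * alpha @ K @ alpha

grad_risk = jax.jit(jax.grad(risk))

# Plain gradient descent on the linear layer; note that no norm penalty is placed on A.
A = jax.random.normal(ka, (r, d)) / jnp.sqrt(d)
for _ in range(300):
    A = A - 0.5 * grad_risk(A)

# A sharp drop after the first singular value of the learned A would indicate
# that the layered fit has concentrated on a low-dimensional subspace.
print(jnp.linalg.svd(A, compute_uv=False))
```

In this sketch the dimensionality reduction claimed in the abstract would show up as a rapid decay of the singular values of the learned linear layer, even though only the RKHS layer carries an explicit penalty.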
layered models
regularization
feature learning
central mean subspace
reproducing kernel Hilbert space
ridge regression
Main Sponsor
Section on Statistical Learning and Data Science