Statistical-Computational Trade-offs for Recursive Adaptive Partitioning Estimators

Jason Klusowski, Speaker
Princeton University
 
Wednesday, Aug 6: 8:30 AM - 10:20 AM
Topic-Contributed Paper Session 
Music City Center 
Recursive adaptive partitioning estimators, such as decision trees and their ensembles, are effective for high-dimensional regression but are usually trained greedily, and greedy training can get stuck at suboptimal solutions. We study this phenomenon in the estimation of sparse regression functions over binary features, showing that when the true function satisfies a certain structural condition, the Merged Staircase Property (MSP) of Abbe et al. (2022), greedy training achieves low estimation error with a sample size only logarithmic in the number of features. In contrast, when the MSP fails, estimation under greedy training becomes exponentially harder. Interestingly, this dichotomy between efficient and inefficient estimation mirrors the behavior of two-layer neural networks trained with SGD in the mean-field regime. Meanwhile, recursive adaptive partitioning estimators trained by empirical risk minimization (ERM) achieve low estimation error with logarithmically many samples regardless of the MSP; since exact ERM over recursive partitions is computationally intractable in general, this reveals a fundamental statistical-computational trade-off for greedy training.
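To make the greedy mechanism concrete, here is a minimal illustrative sketch, not the paper's code, of CART-style greedy recursive partitioning on binary features, contrasted on two toy targets: a staircase-type function, where each new coordinate becomes detectable after the previous split, and a parity function, which violates staircase-type conditions because its monomial introduces two coordinates at once. All names (greedy_tree, y_stair, y_parity) and parameter choices are hypothetical, chosen only for this demonstration.

```python
import numpy as np

def sse(y):
    """Sum of squared errors around the mean (0 for an empty node)."""
    return float(((y - y.mean()) ** 2).sum()) if len(y) else 0.0

def greedy_tree(X, y, depth, min_leaf=5):
    """Fit a regression tree by greedily splitting on the single 0/1
    feature that most reduces the in-sample SSE (CART-style)."""
    node = {"pred": float(y.mean()), "n": len(y)}
    if depth == 0 or len(y) < 2 * min_leaf:
        return node
    base, best = sse(y), None
    for j in range(X.shape[1]):
        m = X[:, j] == 1
        if m.sum() < min_leaf or (~m).sum() < min_leaf:
            continue
        s = sse(y[m]) + sse(y[~m])
        if best is None or s < best[1]:
            best = (j, s)
    if best is None or best[1] >= base:  # no split helps: greedy stalls
        return node
    j, m = best[0], X[:, best[0]] == 1
    node.update(split=j, gain=(base - best[1]) / base,
                left=greedy_tree(X[~m], y[~m], depth - 1, min_leaf),
                right=greedy_tree(X[m], y[m], depth - 1, min_leaf))
    return node

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(4000, 10))

# Staircase-style target: x0 alone reduces SSE, and x1 becomes
# informative once the data have been split on x0.
y_stair = (X[:, 0] + X[:, 0] * X[:, 1]).astype(float)

# Parity target: in population, no single-coordinate split reduces SSE,
# so the first greedy gain is ~0 and the chosen coordinate is arbitrary.
y_parity = (X[:, 0] ^ X[:, 1]).astype(float)

t = greedy_tree(X, y_stair, depth=2)
print("staircase: first split on x%d, relative gain %.3f" % (t["split"], t["gain"]))
t = greedy_tree(X, y_parity, depth=2)
print("parity:    first split on x%d, relative gain %.5f" % (t["split"], t["gain"]))
```

On the staircase target the first greedy split locks onto x0 with a large SSE reduction; on the parity target the best first-split gain is driven purely by sampling noise and the selected coordinate is essentially random. This is the kind of stalling behavior that structural conditions in the MSP spirit are meant to rule out, though the precise setting and rates in the talk are those of the paper, not this sketch.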

Keywords

decision trees, random forests, neural networks, greedy algorithms, gradient descent