Print Close

Minimax Rate Optimal Algorithms For High-Dimensional Stochastic Linear Bandits

Presented During: Advances in Adaptive Experimentation and Anytime-Valid Inference

Yanglei Song Speaker
Queen's University

Sunday, Aug 3: 2:05 PM - 2:25 PM
Topic-Contributed Paper Session

Music City Center

We consider the stochastic linear bandit problem with several arms over T rounds, where the dimension d of the covariate vectors is potentially larger than T, although each arm-specific parameter vector has at most s non-zero components. First, we study the single-arm case, focusing on the cumulative estimation error of the parameter vector. We show that Lasso has a sub-optimal dependence on d and T in terms of worst-case error in the online estimation setup. Further, we establish the minimax optimality of the thresholded Lasso, which first estimates the support of the parameter vector by thresholding the Lasso estimator and then runs least squares on the selected support. Second, we consider the bandit setup, using the thresholded Lasso as the main estimation method. We propose a three-stage algorithm for arm selection and derive an upper bound of order s(log(d)+log(T))log(s) on its cumulative regret. Finally, we establish a matching lower bound up to a log(s) multiplicative term, which shows the near minimax optimality of the proposed method. We note that our analysis avoids the restrictive "beta-min" condition.

Keywords

Stochastic Linear Bandits

High-dimensional Linear Models

Minimax Rate Optimality