Semi-Parametric Batched Global Multi-Armed Bandits with Covariates
Sakshi Arya
First Author
Case Western Reserve University
Sakshi Arya
Presenting Author
Case Western Reserve University
Monday, Aug 4: 3:35 PM - 3:50 PM
1795
Contributed Papers
Music City Center
In applications such as clinical trials, treatment decisions are typically made in phases, or batches, where information from the previous batch is used to determine the treatments allocated in the next batch. Such scenarios fit naturally within the batched bandit framework. While batched bandits have been studied in parametric and nonparametric regression settings, we propose a novel semi-parametric approach that promotes interpretability and dimension reduction in nonparametric batched bandits. We assume that the reward-covariate relationship can be modeled in a reduced one-dimensional central subspace using the single-index regression framework. We adopt an adaptive binning and successive elimination algorithm and establish optimal regret guarantees for it. We also illustrate the algorithm's performance on simulated and real datasets.
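As a rough illustration of the successive elimination idea mentioned in the abstract, the following minimal sketch runs a generic batched successive elimination routine on a simulated bandit without covariates. It is not the algorithm proposed in the talk: it omits the single-index estimation and the adaptive binning entirely. The function name `batched_successive_elimination`, the Gaussian reward simulator, the confidence radius, and all parameter values are assumptions made for illustration only.

```python
import math
import random


def batched_successive_elimination(true_means, batch_sizes, noise_sd=1.0,
                                   delta=0.05, seed=0):
    """Hypothetical sketch: return the arms still active after all batches.

    Assumes simulated Gaussian rewards with noise_sd <= 1, so that a
    1-sub-Gaussian confidence radius is valid. No covariates are used.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    active = list(range(n_arms))
    sums = [0.0] * n_arms
    counts = [0] * n_arms
    # Union bound over arms and batches (an illustrative choice).
    log_term = math.log(2 * n_arms * len(batch_sizes) / delta)

    for batch_size in batch_sizes:
        # Within a batch, split the pulls evenly over the active arms;
        # no decisions are revised until the batch is complete.
        pulls = batch_size // len(active)
        if pulls == 0:
            continue
        for arm in active:
            for _ in range(pulls):
                sums[arm] += rng.gauss(true_means[arm], noise_sd)
            counts[arm] += pulls

        # At the end of the batch, compute confidence intervals.
        mean = {a: sums[a] / counts[a] for a in active}
        radius = {a: math.sqrt(2 * log_term / counts[a]) for a in active}

        # Eliminate every arm whose upper bound falls below the best lower bound.
        best_lcb = max(mean[a] - radius[a] for a in active)
        active = [a for a in active if mean[a] + radius[a] >= best_lcb]

    return active


if __name__ == "__main__":
    # Two simulated arms with a large gap, observed over three batches.
    print(batched_successive_elimination([0.0, 1.0], [400, 400, 400]))
```

The key point the sketch tries to convey is that elimination decisions are made only at batch boundaries, using all data gathered so far. In a covariate setting of the kind the abstract describes, a routine like this would presumably be run within bins of the covariate space rather than globally.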
multi-armed bandits
semi-parametric
single-index regression
dynamic binning
successive elimination
regret bounds
Main Sponsor
Section on Statistical Learning and Data Science