Non-parametric Adaptive Estimation of Transition Kernels of Controlled Markov Chains
Tuesday, Aug 6: 2:35 PM - 2:50 PM
3474
Contributed Papers
Oregon Convention Center
A controlled Markov chain (CMC) is a paired process consisting of a Markovian state and a non-Markovian control. The control is a random variable that selects a transition kernel, and the state then transitions according to that kernel. The recent popularity of model-based offline reinforcement learning has made learning this transition kernel (a.k.a. the "model") an important open question. This talk aims to address that question through the lens of an adaptive, non-parametric estimator. In particular, we will pose the estimator as the solution to a constrained minimax optimisation problem and explore its finite-sample risk bounds. We will also connect it to recent developments in the theory of model selection. Finally, we will discuss some examples that illustrate the applicability of our setup to downstream estimation tasks.
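To make the setup concrete, the following is a minimal sketch of a controlled Markov chain on a finite state space, together with a plain count-based estimate of its transition kernel. It is not the adaptive, non-parametric estimator of the talk: the two-state kernels, the history-dependent policy, and all function names are hypothetical choices made only for illustration.

```python
import random

def simulate_cmc(kernels, policy, x0, n_steps, rng):
    """Simulate a finite-state controlled Markov chain (illustrative only).

    kernels[a][x] lists the next-state probabilities from state x under control a.
    policy(history, x, rng) may look at the whole history, so the control
    sequence need not be Markovian.
    """
    history, x = [], x0
    for _ in range(n_steps):
        a = policy(history, x, rng)                    # control selects a kernel
        probs = kernels[a][x]
        x_next = rng.choices(range(len(probs)), weights=probs)[0]
        history.append((x, a, x_next))                 # record the transition
        x = x_next
    return history

def empirical_kernel(history, n_states, n_controls):
    """Count-based estimate of P(x' | x, a); a row is None if (x, a) was never seen."""
    counts = [[[0] * n_states for _ in range(n_states)] for _ in range(n_controls)]
    for x, a, x_next in history:
        counts[a][x][x_next] += 1
    estimate = []
    for a in range(n_controls):
        rows = []
        for x in range(n_states):
            total = sum(counts[a][x])
            rows.append([c / total for c in counts[a][x]] if total else None)
        estimate.append(rows)
    return estimate

def policy(history, x, rng):
    # Hypothetical non-Markovian rule: use control 1 when the last two
    # transitions started from the same state, otherwise pick at random.
    if len(history) >= 2 and history[-1][0] == history[-2][0]:
        return 1
    return rng.randrange(2)

# Hypothetical ground truth: two controls, two states.
kernels = [
    [[0.9, 0.1], [0.2, 0.8]],   # control 0
    [[0.5, 0.5], [0.3, 0.7]],   # control 1
]

rng = random.Random(0)
history = simulate_cmc(kernels, policy, x0=0, n_steps=20000, rng=rng)
estimate = empirical_kernel(history, n_states=2, n_controls=2)
```

Each row of `estimate` can be compared with the corresponding row of `kernels`. On continuous state spaces such counting is no longer available, which is where a non-parametric estimator of the kind described in the abstract would be needed.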
Markov chain
Controlled Markov Chain
Non-parametric estimation
Adaptive estimation
Besov classes
optimisation
Main Sponsor
IMS