Non-parametric Adaptive Estimation of Transition Kernels of Controlled Markov Chains

Abstract Number:

3474 

Submission Type:

Contributed Abstract 

Contributed Abstract Type:

Paper 

Participants:

Imon Banerjee (1)

Institutions:

(1) Northwestern University, N/A

First Author:

Imon Banerjee  
Northwestern University

Presenting Author:

Imon Banerjee  
Purdue University

Abstract Text:

A controlled Markov chain (CMC) is a paired process consisting of a Markovian state and a non-Markovian control. The control is a random variable that selects a transition kernel, and the state then transitions according to that kernel. The recent popularity of model-based offline reinforcement learning has made learning this transition kernel (a.k.a. the "model") an important open question. This talk addresses that question through the lens of an adaptive, non-parametric estimator. In particular, we pose the estimator as the solution to a constrained minimax optimisation problem and explore its finite-sample risk bounds. We also connect it to recent developments in the theory of model selection. Finally, we discuss some examples that illustrate the applicability of our setup to downstream estimation tasks.
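
To make the definition of a CMC concrete, the following is a minimal sketch in Python, assuming a finite state space and a finite control set. It is not the adaptive non-parametric estimator discussed in the talk: the function names `simulate_cmc` and `empirical_kernel`, the two example kernels, and the history-dependent control rule are all hypothetical choices made for illustration, and the estimator shown is a plain count-based (empirical frequency) estimate.

```python
import numpy as np


def simulate_cmc(kernels, n_steps, rng):
    """Simulate a finite controlled Markov chain (illustrative only).

    kernels[a] is the row-stochastic transition matrix used under control a.
    The control rule depends on the whole visit history (a hypothetical
    choice), so the control sequence is not Markovian.
    """
    n_controls, n_states, _ = kernels.shape
    states = np.empty(n_steps + 1, dtype=int)
    controls = np.empty(n_steps, dtype=int)
    states[0] = 0
    visits = np.zeros(n_states, dtype=int)
    for t in range(n_steps):
        visits[states[t]] += 1
        if rng.random() < 0.5:
            # History-dependent branch: driven by the least-visited state so far.
            controls[t] = int(np.argmin(visits)) % n_controls
        else:
            # Exploration branch: uniform over the control set.
            controls[t] = rng.integers(n_controls)
        # The chosen control selects the kernel; the state moves according to it.
        states[t + 1] = rng.choice(n_states, p=kernels[controls[t], states[t]])
    return states, controls


def empirical_kernel(states, controls, n_states, n_controls):
    """Count-based estimate of P(next state | state, control)."""
    counts = np.zeros((n_controls, n_states, n_states))
    np.add.at(counts, (controls, states[:-1], states[1:]), 1.0)
    totals = counts.sum(axis=2, keepdims=True)
    # Unvisited (control, state) pairs fall back to the uniform distribution.
    return np.where(totals > 0, counts / np.maximum(totals, 1.0), 1.0 / n_states)


# Two hypothetical kernels on three states, one per control.
kernels = np.array([
    [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.2, 0.3, 0.5]],
    [[0.2, 0.5, 0.3], [0.4, 0.2, 0.4], [0.3, 0.3, 0.4]],
])
rng = np.random.default_rng(0)
states, controls = simulate_cmc(kernels, 20000, rng)
est = empirical_kernel(states, controls, n_states=3, n_controls=2)
print("max abs error:", np.abs(est - kernels).max())
```

The sketch only illustrates the data-generating mechanism (control picks a kernel, state follows it) and the simplest possible estimate from logged data; it says nothing about the risk bounds or model-selection aspects of the abstract.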

Keywords:

Markov chain|Controlled Markov chain|Non-parametric estimation|Adaptive estimation|Besov classes|Optimisation

Sponsors:

IMS

Tracks:

Statistical Theory

Can this be considered for alternate subtype?

Yes

Are you interested in volunteering to serve as a session chair?

Yes

I have read and understand that JSM participants must abide by the Participant Guidelines.

Yes

I understand that JSM participants must register and pay the appropriate registration fee by June 1, 2024. The registration fee is non-refundable.

I understand