Online Personalized Policy Learning Using Mobile Health Data

Min Qian, Speaker
Columbia University
 
Sunday, Aug 3: 2:30 PM - 2:55 PM
Invited Paper Session 
Music City Center 
With the increasing focus on improving personal health and fitness using smart devices and wearables, it is crucial to create a mobile clinical decision support system. In this work, we consider the development of personalized policies that allow different intervention recommendations for individuals with the same observed features. A personalized policy represents a paradigm shift from one decision rule for all users to an individualized decision rule for each user. To optimize the expected reward, we propose a generalized linear mixed modeling framework in which population effects and individual deviations from the population effects are modeled as fixed and random effects, respectively, and combined to form the personalized policy. We introduce a contextual bandit algorithm to learn the personalized policies. This approach is theoretically justified by a regret bound and illustrated using mobile apps, with the goal of maximizing the push notification response rate given past app usage and other contextual factors.
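As a rough illustration of how fixed and random effects could combine into a per-user decision rule, the sketch below scores each candidate action with a logistic link applied to the sum of a population effect and a user-specific deviation, then picks the highest-scoring action. It is not the algorithm from the talk. The feature layout, the action names ("morning", "evening"), every numeric value, and the function names are hypothetical, and the rule shown is purely greedy: it omits the exploration step and the online estimation of effects that a contextual bandit algorithm would need.

```python
import math


def response_probability(features, population_effect, individual_deviation):
    """Modeled response probability under a logistic link.

    The linear score uses (population effect + individual deviation) as the
    coefficient vector, i.e. fixed effect plus random effect.
    """
    score = sum(
        x * (beta + b)
        for x, beta, b in zip(features, population_effect, individual_deviation)
    )
    return 1.0 / (1.0 + math.exp(-score))


def recommend(features_by_action, population_effect, individual_deviation):
    """Greedy choice: the action with the largest modeled response probability."""
    return max(
        features_by_action,
        key=lambda action: response_probability(
            features_by_action[action], population_effect, individual_deviation
        ),
    )


# Hypothetical context: recent app usage, shared by both users below.
recent_usage = 0.5

# Hypothetical action-specific features:
# [morning indicator, evening indicator, usage if morning, usage if evening]
features_by_action = {
    "morning": [1.0, 0.0, recent_usage, 0.0],
    "evening": [0.0, 1.0, 0.0, recent_usage],
}

# Hypothetical fixed (population) effects.
population_effect = [0.2, 0.0, 0.4, 0.3]

# Hypothetical random effects: user A matches the population,
# user B responds better to evening notifications.
deviation_user_a = [0.0, 0.0, 0.0, 0.0]
deviation_user_b = [-0.5, 0.5, 0.0, 0.0]

user_a_choice = recommend(features_by_action, population_effect, deviation_user_a)
user_b_choice = recommend(features_by_action, population_effect, deviation_user_b)
```

Both users share the same observed features, yet their deviations lead to different recommendations, which is the behavior a single population-level rule cannot produce.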

Keywords

contextual bandits

generalization error bound