A Minimax Approach for Optimal Incentive Policy Learning
Yue Liu
Co-Author
School of Statistics, Renmin University of China
Hao Mei
Co-Author, Presenting Author
School of Statistics, Renmin University of China
Chenyang Li
First Author
School of Statistics, Renmin University of China
Sunday, Aug 3: 2:05 PM - 2:20 PM
1767
Contributed Papers
Music City Center
While strategic incentive policies are essential in personalized services, variation in user responses and revenue potential creates significant challenges in identifying optimal incentive policies. Existing literature typically falls short of fully identifying all user types and often assumes a uniform conversion revenue, leading to inefficient incentive allocation. In this study, we propose a minimax policy learning approach within a counterfactual principal strata framework. A value function, accommodating varying rewards across six potentially non-identifiable principal strata, is designed to minimize the worst-case value loss relative to three alternative policies: never-treat, always-treat, and oracle. To learn an optimal policy, we introduce three estimators: Principal Outcome Regression (P-OR), Principal Inverse Propensity Scoring (P-IPS), and Principal Doubly Robust (P-DR), and provide theoretical guarantees for their unbiasedness, robustness, and regret upper bounds. Extensive numerical experiments validate the effectiveness and superiority of the proposed approach.
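To illustrate the flavor of the estimators named above, here is a minimal sketch of a standard inverse-propensity-scoring (IPS) policy value estimate on logged incentive data. This is a generic textbook construction, not the paper's P-IPS estimator, which additionally weights rewards by (partially identified) principal strata; the function and variable names are hypothetical.

```python
import numpy as np

def ips_policy_value(rewards, propensities, policy_probs):
    """Generic IPS estimate of a target policy's value from logged data.

    rewards:      observed rewards under the logging policy
    propensities: logging policy's probability of the logged action
    policy_probs: target policy's probability of the logged action
    """
    weights = policy_probs / propensities  # importance weights
    return float(np.mean(weights * rewards))

# Toy logged data: binary incentive assigned with known propensity 0.5.
rng = np.random.default_rng(0)
n = 10_000
actions = rng.integers(0, 2, size=n)
propensities = np.full(n, 0.5)
# Illustrative response model: treated users yield ~1 unit more reward.
rewards = 1.0 * actions + rng.normal(0.0, 0.1, size=n)

# Evaluate the always-treat benchmark policy: it puts probability 1 on
# action 1, so its probability of the logged action is 1{action == 1}.
policy_probs = (actions == 1).astype(float)
value = ips_policy_value(rewards, propensities, policy_probs)
# value is close to 1.0, the mean reward under always-treat.
```

The doubly robust variant would add a fitted outcome-regression term to this weighted average, remaining consistent if either the propensity model or the outcome model is correct.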
Policy learning
Minimax
Counterfactual principal strata
Partial identification
Causal inference
Main Sponsor
International Chinese Statistical Association