Low-Rank Online Dynamic Assortment with Dual Contextual Information

Will Wei Sun Co-Author
Purdue University
 
Yufeng Liu Co-Author
University of North Carolina at Chapel Hill
 
Seong Jin Lee First Author
University of North Carolina at Chapel Hill
 
Seong Jin Lee Presenting Author
University of North Carolina at Chapel Hill
 
Monday, Aug 4: 2:50 PM - 3:05 PM
1417 
Contributed Papers 
Music City Center 
As e-commerce expands, delivering real-time personalized recommendations from vast catalogs poses a critical challenge for retail platforms. Maximizing revenue requires careful consideration of both individual customer characteristics and available item features to optimize assortments over time. In this paper, we consider the dynamic assortment problem with dual contexts: user features and item features. In high-dimensional scenarios, the number of parameters grows quadratically with the feature dimensions, which complicates both computation and estimation. To tackle this challenge, we introduce a new low-rank dynamic assortment model that reduces the problem to a manageable scale. We then propose an efficient algorithm that estimates the intrinsic subspaces and uses the upper confidence bound approach to address the exploration-exploitation trade-off in online decision making. Theoretically, we establish a regret bound that substantially improves on prior literature, made possible by leveraging the low-rank structure. Extensive simulations and an application to the Expedia hotel recommendation dataset further demonstrate the advantages of our proposed method.
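To make the dimension argument concrete, here is a minimal sketch of one common way to set up a dual-context choice model. It is an illustration under stated assumptions, not the authors' formulation. It assumes a bilinear utility `user^T Theta item`, a multinomial logit (MNL) choice rule with a no-purchase option, and a truncated SVD as a stand-in for subspace estimation. All function names, dimensions, and prices are hypothetical, and the upper confidence bound step is omitted.

```python
import numpy as np

def low_rank_theta(d1, d2, rank, rng):
    """Hypothetical rank-`rank` parameter matrix Theta = U V^T (an assumption)."""
    U = rng.normal(size=(d1, rank)) / np.sqrt(d1)
    V = rng.normal(size=(d2, rank)) / np.sqrt(d2)
    return U @ V.T

def mnl_choice_probs(user, items, theta):
    """MNL purchase probabilities for each offered item.

    The no-purchase option has utility 0, so its weight is exp(0) = 1.
    """
    utilities = items @ theta.T @ user      # utility_j = user^T Theta item_j
    weights = np.exp(utilities)
    return weights / (1.0 + weights.sum())

def expected_revenue(user, items, prices, theta):
    """Expected revenue of offering `items` to `user` under the MNL model."""
    return float(prices @ mnl_choice_probs(user, items, theta))

def top_subspaces(theta_hat, rank):
    """Leading left/right singular subspaces of an estimate of Theta."""
    U, _, Vt = np.linalg.svd(theta_hat, full_matrices=False)
    return U[:, :rank], Vt[:rank].T

rng = np.random.default_rng(0)
d1, d2, rank, n_items = 20, 30, 2, 5        # illustrative sizes only
theta = low_rank_theta(d1, d2, rank, rng)
user = rng.normal(size=d1)
items = rng.normal(size=(n_items, d2))
prices = np.ones(n_items)

probs = mnl_choice_probs(user, items, theta)
revenue = expected_revenue(user, items, prices, theta)
U_hat, V_hat = top_subspaces(theta, rank)

full_params = d1 * d2                       # unstructured Theta
low_rank_params = rank * (d1 + d2)          # factors U and V
```

With these illustrative sizes, the unstructured matrix has 600 entries while the two rank-2 factors have 100, which is the kind of reduction a low-rank assumption is meant to deliver.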

Keywords

Bandit Algorithm

Low-rankness

Online Decision Making

Reinforcement Learning 

Main Sponsor

Section on Statistical Learning and Data Science