Efficient Generative Modeling via Penalized Optimal Transport Network

Chenyang Zhong Co-Author
Department of Statistics, Columbia University
 
Wing-Hung Wong Co-Author
Stanford University
 
Wenhui Sophia Lu First Author
Stanford University
 
Wenhui Sophia Lu Presenting Author
Stanford University
 
Tuesday, Aug 5: 10:50 AM - 11:05 AM
1306 
Contributed Papers 
Music City Center 
Synthetic data generation plays a critical role across scientific disciplines, from systematic model evaluation to augmenting limited datasets. While Wasserstein Generative Adversarial Networks have shown promise in this area, they are susceptible to mode collapse. This limitation results in generated samples that neglect critical aspects of the true data distribution, particularly its tails and minor modes, thus undermining downstream analyses and jeopardizing reliable decision-making. To address these challenges, we introduce the Penalized Optimal Transport Network (POTNet), a novel deep generative model that provably mitigates mode collapse. POTNet leverages a robust and interpretable Marginally-Penalized Wasserstein loss to steer the alignment of joint distributions. Moreover, our primal-based framework eliminates the need for a critic network, thereby circumventing the instabilities of adversarial training and obviating extensive hyperparameter tuning. Through both theoretical analysis and comprehensive empirical evaluation, we demonstrate that POTNet effectively attenuates mode collapse and substantially outperforms existing methods in accurately recovering complex underlying data structures.
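
The abstract does not give the loss in closed form, so the following is only an illustrative sketch of what a marginally penalized, primal (critic-free) Wasserstein objective could look like on two equal-size samples. It assumes the loss is the joint 1-Wasserstein distance plus a weight times the sum of the one-dimensional marginal 1-Wasserstein distances. The function names `joint_w1` and `marginally_penalized_w1` and the weight `lam` are hypothetical and are not taken from the authors' implementation.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.stats import wasserstein_distance


def joint_w1(x, y):
    """Primal empirical 1-Wasserstein distance between equal-size samples.

    With n uniformly weighted points on each side, an optimal transport
    plan can be taken to be a permutation, so an assignment solver gives
    the exact value without training a critic network.
    """
    # Pairwise Euclidean costs, shape (n, n).
    cost = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1)
    rows, cols = linear_sum_assignment(cost)
    return cost[rows, cols].mean()


def marginally_penalized_w1(x, y, lam=1.0):
    """Assumed form: joint W1 plus lam times the sum of 1-D marginal W1."""
    marginal = sum(
        wasserstein_distance(x[:, j], y[:, j]) for j in range(x.shape[1])
    )
    return joint_w1(x, y) + lam * marginal


# Usage: compare a sample with a copy shifted along the first coordinate.
rng = np.random.default_rng(0)
real = rng.normal(size=(64, 2))
fake = real + np.array([3.0, 0.0])
loss = marginally_penalized_w1(real, fake, lam=0.5)
```

In this sketch the marginal terms reduce to comparisons of sorted one-dimensional samples, which is what makes the penalty cheap to evaluate. A trainable model would need a differentiable version of both terms, which this NumPy illustration does not provide.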

Keywords

mode collapse

synthetic data generation

marginal penalization

marginal regularization

generative density estimation

Wasserstein distance 

Main Sponsor

Section on Statistical Learning and Data Science