Thursday, Aug 6: 10:30 AM - 12:20 PM
1500
Topic-Contributed Paper Session
Thomas M. Menino Convention & Exhibition Center
Room: CC-252B
Applied: No
Main Sponsor: IMS
Co Sponsors: International Indian Statistical Association; International Society for Bayesian Analysis (ISBA)
Presentations
In any generative model, the generated samples may follow a distribution that differs from the data distribution because of inevitable learning errors. Moreover, this discrepancy, and the metrics for evaluating generated samples, are hard to characterize in high dimensions. In this work, we explore a method to approximate the stochastic Koopman operator of the Ornstein-Uhlenbeck (OU) process and use this approximation as a lazy way to generate new samples. Although such an approximation does not reproduce the target probability distribution, it can be adapted to learn certain features of the target, e.g., sampling from the same support. This is joint work with Georg Gottwald (U Sydney).
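For readers unfamiliar with the object named in this abstract, the following is a minimal illustrative sketch, not the speakers' method. It assumes the one-dimensional OU normalization dX = -X dt + sqrt(2) dW, for which the stochastic Koopman operator (K_t f)(x) = E[f(X_t) | X_0 = x] can be evaluated from the known Gaussian transition law. The function names `ou_koopman` and `ou_koopman_square_exact` are hypothetical.

```python
import numpy as np

def ou_koopman(f, x, t, n_samples=200_000, seed=0):
    """Monte Carlo estimate of (K_t f)(x) = E[f(X_t) | X_0 = x].

    Assumes the OU process dX = -X dt + sqrt(2) dW (an assumed
    normalization), whose transition law is
    X_t | X_0 = x  ~  Normal(x * exp(-t), 1 - exp(-2t)).
    """
    rng = np.random.default_rng(seed)
    mean = x * np.exp(-t)
    std = np.sqrt(1.0 - np.exp(-2.0 * t))
    z = rng.standard_normal(n_samples)
    return f(mean + std * z).mean()

def ou_koopman_square_exact(x, t):
    """Closed form of (K_t f)(x) for f(y) = y**2 under the same assumption."""
    return x**2 * np.exp(-2.0 * t) + 1.0 - np.exp(-2.0 * t)

if __name__ == "__main__":
    x0, t = 1.5, 0.7
    estimate = ou_koopman(lambda y: y**2, x0, t)
    exact = ou_koopman_square_exact(x0, t)
    print(f"Monte Carlo: {estimate:.4f}, closed form: {exact:.4f}")
```

The closed form for f(y) = y^2 gives a quick sanity check on the Monte Carlo estimate; how the talk approximates the operator in practice is not specified in the abstract.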
Diffusion language models (DLMs) have emerged as a compelling alternative to autoregressive (AR) models by enabling parallel, non-sequential token generation. Despite their strong empirical performance, the theoretical understanding of how decoding strategies affect sampling efficiency remains limited. In this talk, we present recent theoretical advances showing that DLM sampling can be substantially accelerated through decoding strategies that exploit low-dimensional structure in the target data distribution. We establish convergence guarantees for both uniform and confidence-based decoding strategies, proving that high-quality samples can be generated in a sublinear number of iterations. In particular, the iteration complexity depends on information-theoretic quantities that capture the intrinsic complexity of the target distribution. These results provide a theoretical foundation for efficient diffusion-based language generation and offer principled insights into decoding strategy design.
I will present some findings from recent works (with coauthors) on the connections between MCMC and density estimation/generative modeling.
Diffusion models and flow-based methods have shown impressive generative capability, especially for images, but their sampling is expensive because it requires many iterative updates. We introduce W-Flow, a framework for training a generator that transforms samples from a simple reference distribution into samples from a target data distribution in a single step. Training proceeds in two stages: first, we define an evolution from the reference distribution to the target distribution through a Wasserstein gradient flow that minimizes an energy functional; second, we train a static neural generator to compress this evolution into one-step generation. We instantiate the energy functional with the Sinkhorn divergence, which yields an efficient optimal-transport-based update rule that captures global distributional discrepancy and improves coverage of the target distribution. We further prove that the finite-sample training dynamics converge to the continuous-time distributional dynamics under suitable assumptions. Empirically, W-Flow sets a new state of the art for one-step ImageNet 256x256 generation, achieving 1.29 FID, with improved mode coverage and domain transfer. Compared to multi-step diffusion models with similar FID scores, our method yields approximately 100x faster sampling. These results show that Wasserstein gradient flows provide a principled and effective foundation for fast and high-fidelity generative modeling.
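As background on the energy functional named in this abstract, the following is a minimal sketch of a Sinkhorn divergence between two small point clouds. It is not the W-Flow implementation. It assumes uniform weights, the cost 0.5 * ||x - y||^2, a fixed number of log-domain Sinkhorn iterations, and the common debiased definition S(a, b) = OT_eps(a, b) - 0.5 * OT_eps(a, a) - 0.5 * OT_eps(b, b). The function names and the values of `eps` and `n_iter` are hypothetical choices for illustration.

```python
import numpy as np

def _logsumexp(m, axis):
    """Numerically stable log-sum-exp along one axis."""
    mx = m.max(axis=axis, keepdims=True)
    return (mx + np.log(np.exp(m - mx).sum(axis=axis, keepdims=True))).squeeze(axis)

def entropic_ot(x, y, eps=1.0, n_iter=300):
    """Dual value of entropy-regularized OT between uniform point clouds.

    x: (n, d) array, y: (m, d) array. Uses log-domain Sinkhorn updates
    on the dual potentials f and g (assumed settings, not tuned).
    """
    n, m = len(x), len(y)
    log_a = np.full(n, -np.log(n))
    log_b = np.full(m, -np.log(m))
    cost = 0.5 * ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
    f = np.zeros(n)
    g = np.zeros(m)
    for _ in range(n_iter):
        f = -eps * _logsumexp(log_b[None, :] + (g[None, :] - cost) / eps, axis=1)
        g = -eps * _logsumexp(log_a[:, None] + (f[:, None] - cost) / eps, axis=0)
    # With uniform weights, <a, f> + <b, g> reduces to the two means.
    return f.mean() + g.mean()

def sinkhorn_divergence(x, y, eps=1.0, n_iter=300):
    """Debiased Sinkhorn divergence built from three entropic OT problems."""
    return (entropic_ot(x, y, eps, n_iter)
            - 0.5 * entropic_ot(x, x, eps, n_iter)
            - 0.5 * entropic_ot(y, y, eps, n_iter))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((40, 2))
    y = x + np.array([2.0, 2.0])  # shifted copy of the same cloud
    print("S(x, x) =", sinkhorn_divergence(x, x))
    print("S(x, y) =", sinkhorn_divergence(x, y))
```

The self-terms remove the entropic bias, so the divergence of a cloud with itself is zero while a shifted copy gives a strictly positive value.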