Advances in Generative Models

Chair: Osafu Augustine Egbon
National Institute of Environmental Health Sciences
 
Tuesday, Aug 5: 10:30 AM - 12:20 PM
4096 
Contributed Papers 
Music City Center 
Room: CC-103A 

Main Sponsor

Section on Statistical Learning and Data Science

Presentations

AI-Generated Images of Cancer Patients: Comparing the Results of Two Generative AI Models

Health communicators can use generative AI tools to create images for stakeholder-facing materials. This study examines differences between two image-generation tools (DALL-E and Stable Diffusion) to understand how each portrays individuals with cancer.
Images (n = 303) generated by the two tools using the prompts "cancer patient", "breast cancer patient", "lung cancer patient", "prostate cancer patient", "cancer survivor", and "person with cancer" were coded for photorealism and for rendering errors, such as extra hands or misspelled words. Most images were coded as photorealistic (79.5%, n = 241) and free of significant rendering errors (84.2%, n = 255). Stable Diffusion was more likely to produce a photorealistic result (66.4%, n = 160), while DALL-E more often produced images without errors (53.3%, n = 136). Compared with DALL-E, images generated by Stable Diffusion more often depicted the person lying in bed, wearing a hospital gown, and appearing sick.
Understanding how generative AI tools portray individuals with cancer is an important step in using these tools in communications. 

Keywords

AI-generated images

cancer patients

representation

ChatGPT

Stable Diffusion

visual content analysis 

Co-Author(s)

Sylvia Chou, National Cancer Institute
Anna Gaysynsky, National Cancer Institute
Irina Iles, National Cancer Institute
Abigail Muro, National Cancer Institute
Nicole Senft, National Cancer Institute

First Author

Kristin Schrader, Westat

Presenting Author

Kristin Schrader, Westat

Efficient Generative Modeling via Penalized Optimal Transport Network

Synthetic data generation plays a critical role across scientific disciplines, from systematic model evaluation to augmenting limited datasets. While Wasserstein Generative Adversarial Networks have shown promise in this area, they are susceptible to mode collapse. This limitation results in generated samples that neglect critical aspects of the true data distribution, particularly its tails and minor modes, thus undermining downstream analyses and jeopardizing reliable decision-making. To address these challenges, we introduce the Penalized Optimal Transport Network (POTNet), a novel deep generative model that provably mitigates mode collapse. POTNet leverages a robust and interpretable Marginally-Penalized Wasserstein loss to steer the alignment of joint distributions. Moreover, our primal-based framework eliminates the need for a critic network, thereby circumventing the instabilities of adversarial training and obviating extensive hyperparameter tuning. Through both theoretical analysis and comprehensive empirical evaluation, we demonstrate that POTNet effectively attenuates mode collapse and substantially outperforms existing methods in accurately recovering complex underlying data structures.
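The idea of penalizing marginal mismatch alongside a joint distance can be illustrated numerically. The sketch below is an approximation, not the authors' POTNet loss: it pairs a sliced-Wasserstein proxy for the joint distance with per-coordinate marginal penalties, and the names `w1_1d`, `mpw_loss`, and the weight `lam` are illustrative.

```python
import numpy as np

def w1_1d(a, b):
    # Empirical 1-Wasserstein distance between equal-size 1D samples:
    # mean absolute difference of the sorted values.
    return np.mean(np.abs(np.sort(a) - np.sort(b)))

def mpw_loss(x_real, x_gen, lam=1.0, n_proj=64, seed=0):
    """Sketch of a marginally-penalized Wasserstein-style loss:
    a sliced proxy for the joint distance plus marginal penalties."""
    rng = np.random.default_rng(seed)
    d = x_real.shape[1]
    # Joint term: average 1D distance over random projection directions.
    dirs = rng.normal(size=(n_proj, d))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    joint = np.mean([w1_1d(x_real @ u, x_gen @ u) for u in dirs])
    # Marginal penalty: 1D distance computed coordinate by coordinate.
    marg = np.mean([w1_1d(x_real[:, j], x_gen[:, j]) for j in range(d)])
    return joint + lam * marg

rng = np.random.default_rng(0)
x = rng.normal(size=(500, 3))
loss_same = mpw_loss(x, rng.normal(size=(500, 3)))
loss_shift = mpw_loss(x, rng.normal(size=(500, 3)) + 2.0)
print(loss_same < loss_shift)  # a shifted sample incurs a larger loss
```

Because every term is a one-dimensional transport distance, the whole loss has a closed form on sorted samples, mirroring the abstract's point that a primal loss avoids a critic network.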

Keywords

mode collapse

synthetic data generation

marginal penalization

marginal regularization

generative density estimation

Wasserstein distance 

Co-Author(s)

Chenyang Zhong, Department of Statistics, Columbia University
Wing-Hung Wong, Stanford University

First Author

Wenhui Sophia Lu, Stanford University

Presenting Author

Wenhui Sophia Lu, Stanford University

Generative Transformer for Longitudinal Biomarker and Diet Quality Data Representation

Modeling multidimensional longitudinal RCT data is inherently complex due to temporal dependencies, missing values, and dynamic variations in behavioral responses and outcomes over time. Traditional analysis methods often fall short in capturing the intricate temporal and group-specific patterns present in such datasets. To overcome these limitations, we introduce MITransformer, a generative pretrained transformer framework enhanced with multiple imputations for robust contextual representation learning from longitudinal biomarker data, incorporating diet quality measurements. MITransformer reconstructs input features across time points, effectively capturing temporal patterns and inter-variable relationships, while addressing the issue of missing data through multiple imputation. By applying dynamically scaled positional embeddings within the attention mechanism, the model preserves temporal relationships without distorting continuous data distributions. A gated integration mechanism selectively emphasizes input subsets, allowing the model to differentiate the importance of various input types. The contextual embeddings generated by MITransformer improve representation quality across time, facilitating better clustering and regression/classification outcomes. Our results demonstrate that these embeddings preserve biological and behavioral variation, enabling the model to distinguish between demographic subgroups such as gender without the need for explicit labels. This approach enhances interpretability and analytical performance, laying the foundation for advanced applications such as digital twins, individualized health monitoring, and diet-related outcome prediction, thereby expanding the capabilities of conventional disease diagnosis and prognosis using biomarker data. 
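The "dynamically scaled positional embeddings" can be pictured with a small sketch. The abstract does not specify the construction, so the following is an assumption: a standard sinusoidal embedding shrunk by a scale factor so that, when added to continuous biomarker values, it does not distort their distribution.

```python
import numpy as np

def scaled_positional_embedding(n_steps, d_model, scale=0.1):
    """Illustrative scaled sinusoidal positional embedding (an
    assumption; the paper's 'dynamically scaled' variant is not
    specified in the abstract). The scale factor bounds the
    embedding so it does not swamp continuous inputs."""
    pos = np.arange(n_steps)[:, None]
    i = np.arange(d_model)[None, :]
    angle = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    pe = np.where(i % 2 == 0, np.sin(angle), np.cos(angle))
    return scale * pe

pe = scaled_positional_embedding(n_steps=6, d_model=8, scale=0.1)
print(pe.shape)  # one embedding row per time point
```

Since the sine and cosine terms are bounded by 1, every entry of `pe` lies in [-scale, scale], which is the property the lead-in describes.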

Keywords

Biomarker

Diet Quality Index

Contextual Representation

Longitudinal Data Modeling

Generative Pretrained Transformer

Multiple Imputation 

Co-Author(s)

Hua Fang
Honggang Wang, Yeshiva University

First Author

Ashikur Nobel, Yeshiva University

Presenting Author

Ashikur Nobel, Yeshiva University

Parallelly Tempered Generative Adversarial Networks

Generative adversarial networks (GANs) have become a cornerstone of generative AI for their ability to model complex data-generating processes. However, GAN training is notoriously unstable, often suffering from mode collapse. This work analyzes training instability through the variance of gradients, linking it to multimodality in the target distribution. To address these issues, we propose a novel GAN training framework that uses tempered distributions via convex interpolation. With a new GAN objective, the generator learns all tempered distributions simultaneously, akin to parallel tempering in statistics. Simulations demonstrate the superiority of our method over existing strategies in synthesizing image and tabular data. We theoretically show that this improvement stems from reduced gradient variance using tempered distributions. Additionally, we develop a variant of our framework to generate fair synthetic data, addressing a growing concern in trustworthy AI. 
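One reading of "tempered distributions via convex interpolation" can be sketched on a toy bimodal target: mix each sample with an independently shuffled copy, so that a temperature of 1 recovers the data and lower temperatures blur the separated modes together. This is an illustration of the tempering idea, not the authors' exact construction.

```python
import numpy as np

def tempered_samples(x, alpha, seed=None):
    """Illustrative tempering by convex interpolation: combine each
    sample with an independently shuffled copy. alpha = 1 recovers
    the data; smaller alpha merges well-separated modes."""
    rng = np.random.default_rng(seed)
    x_shuf = x[rng.permutation(len(x))]
    return alpha * x + (1 - alpha) * x_shuf

rng = np.random.default_rng(0)
# Bimodal target: two well-separated Gaussians.
x = np.concatenate([rng.normal(-5, 1, 1000), rng.normal(5, 1, 1000)])
stds = [tempered_samples(x, a, seed=1).std() for a in (1.0, 0.7, 0.5)]
print([round(s, 2) for s in stds])  # spread shrinks as the modes merge
```

The smoother intermediate distributions are the ones a generator can learn alongside the original target, which is the parallel-tempering analogy the abstract draws.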

Keywords

Generative Adversarial Network

Parallel Tempering

Fair Data Generation

Variance Reduction of Gradients 

Co-Author

Qifan Song

First Author

Jinwon Sohn, Purdue University

Presenting Author

Jinwon Sohn, Purdue University

Relaxed χ^2-Divergence Gradient Flow

Transporting samples from a source to a target distribution, given only finite samples from both, is a fundamental problem in machine learning, with applications in generative modeling and variational inference. We address this problem by approximating a discretized gradient flow of the MMD-regularized $\chi^2$-divergence between the evolving source and the fixed target distribution. We provide non-asymptotic error bounds for (i) optimization error (measuring convergence to the target distribution), (ii) sampling error (from finite to infinite sample size), and (iii) approximation error (due to regularization), with particular attention to their dependence on dimensionality. Our minimization scheme admits closed-form updates and employs a data-adaptive annealed regularization strategy to maximize descent. Experiments on tabular and vision datasets demonstrate the effectiveness of our approach. 
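A discretized kernel gradient flow of this kind can be sketched with particles. The code below runs a plain MMD gradient flow with a Gaussian kernel toward a fixed target sample; the paper's MMD-regularized $\chi^2$ scheme with annealed regularization is more involved, so treat this as a simplified stand-in.

```python
import numpy as np

def mmd_grad_step(x, y, step=0.5, bw=2.0):
    """One explicit step of an MMD gradient flow: particles x descend
    the squared MMD toward the fixed target sample y (a simplified
    sketch, not the paper's regularized chi^2 flow)."""
    def grad_k(a, b):
        # Gradient in a of k(a,b) = exp(-||a-b||^2 / (2 bw^2)).
        diff = a[:, None, :] - b[None, :, :]
        k = np.exp(-np.sum(diff**2, axis=-1) / (2 * bw**2))
        return -(k[..., None] * diff / bw**2)
    n, m = len(x), len(y)
    # d/dx_i MMD^2 = (2/n^2) sum_j grad k(x_i,x_j) - (2/nm) sum_j grad k(x_i,y_j)
    g = 2 * grad_k(x, x).sum(axis=1) / n**2 - 2 * grad_k(x, y).sum(axis=1) / (n * m)
    return x - step * n * g  # scale the step by n for a non-vanishing update

rng = np.random.default_rng(0)
target = rng.normal(2.0, 1.0, size=(200, 2))
x = rng.normal(0.0, 1.0, size=(200, 2))
for _ in range(300):
    x = mmd_grad_step(x, target)
print(x.mean())  # particles have drifted toward the target mean of 2
```

The interaction terms decompose into a repulsion among the particles and an attraction to the target sample, which is what keeps the evolving measure spread out rather than collapsed.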

Keywords

gradient flows

convex analysis

$\chi^2$-divergence

generative modeling

Wasserstein space 

Co-Author(s)

Garrett Mulcahy, University of Washington
Soumik Pal, University of Washington
Zaid Harchaoui, University of Washington

First Author

Medha Agarwal, University of Washington

Presenting Author

Medha Agarwal, University of Washington

Vector representations of generative models and their consistent estimation

Generative models, such as large language models or text-to-image diffusion models, generate a random output or response when given a query from a user. Representing them as vectors in a finite-dimensional Euclidean space, based on their responses to a set of queries, facilitates statistical decision-making on black-box generative models using conventional tools. We establish sufficient conditions for consistent estimation of population-level vector representations of a set of generative models based on their sample responses to a set of queries. 
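The raw-stress embedding named in the keywords can be sketched end to end: estimate pairwise dissimilarities between models from their sample responses, then find Euclidean points whose distances match those dissimilarities by minimizing the raw stress. Both the dissimilarity choice (distance between mean responses) and the plain gradient-descent solver below are illustrative assumptions.

```python
import numpy as np

def raw_stress(z, D):
    # Raw stress: squared mismatch between embedding distances and the
    # given dissimilarities, each unordered pair counted once.
    dist = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=-1)
    return ((dist - D) ** 2).sum() / 2

def raw_stress_embed(D, dim=1, n_iter=500, lr=0.01, seed=0):
    """Minimize the raw stress by plain gradient descent (a sketch)."""
    rng = np.random.default_rng(seed)
    n = len(D)
    z = rng.normal(size=(n, dim))
    for _ in range(n_iter):
        diff = z[:, None, :] - z[None, :, :]
        dist = np.linalg.norm(diff, axis=-1) + np.eye(n)  # avoid /0 on diagonal
        coeff = 2 * (dist - np.eye(n) - D) / dist
        np.fill_diagonal(coeff, 0.0)
        z = z - lr * (coeff[..., None] * diff).sum(axis=1)
    return z

# Toy 'models': three response distributions; dissimilarity taken as the
# distance between mean responses (an illustrative choice).
rng = np.random.default_rng(0)
means = np.array([rng.normal(mu, 1.0, 200).mean() for mu in (0.0, 1.0, 5.0)])
D = np.abs(means[:, None] - means[None, :])
z = raw_stress_embed(D, dim=1)
print(raw_stress(z, D) < raw_stress(np.random.default_rng(0).normal(size=(3, 1)), D))
```

Consistency in the abstract's sense would mean that as the number of sample responses grows, the estimated dissimilarities, and hence the embedding, converge to their population counterparts.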

Keywords

generative models

multidimensional scaling

raw stress embedding 

Co-Author(s)

Michael W. Trosset, Indiana University Bloomington
Carey E. Priebe, Johns Hopkins University
Hayden S. Helm, Helivan Research

First Author

Aranyak Acharyya, Johns Hopkins University

Presenting Author

Aranyak Acharyya, Johns Hopkins University

Selective Inference for Multivariate Regression Trees

We consider post-selection inference for regression trees when the response is multivariate. In particular, we study how to appropriately test hypotheses suggested by the fitted tree. We find, as is known when the response is univariate, that to control the Type I error rate one must condition on the recursive data splits leading to the hypothesis in question. One may wish, e.g., to test whether the populations represented by two sibling nodes have the same mean. With a univariate response, proper conditioning on the splits results in a truncation of the null distribution of the test statistic such that p-values must be computed with respect to truncated normal distributions. With a multivariate response, we find that the p-values must be computed with respect to truncated multivariate normal distributions, where the truncation set is defined by a list of quadratic constraints. We show that accept-reject Monte Carlo simulation can give reliable post-selection p-values with a bivariate response and a fairly small number of predictors. To accommodate more predictors, we must consider more efficient ways to obtain probabilities from truncated multivariate normal distributions. 
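The accept-reject step can be sketched concretely: draw from the unconstrained multivariate normal, keep only draws satisfying the quadratic constraints defining the truncation set, and compute the p-value from the accepted draws. The constraint and test statistic below are illustrative, not ones derived from an actual fitted tree.

```python
import numpy as np

def truncated_mvn_pvalue(stat_obs, mean, cov, constraints, n_draws=100_000, seed=0):
    """Accept-reject p-value sketch: keep multivariate normal draws z
    satisfying every quadratic constraint z'Az + b'z + c <= 0, then
    compare the test statistic of accepted draws to the observed one."""
    rng = np.random.default_rng(seed)
    z = rng.multivariate_normal(mean, cov, size=n_draws)
    keep = np.ones(n_draws, dtype=bool)
    for A, b, c in constraints:
        keep &= (np.einsum('ni,ij,nj->n', z, A, z) + z @ b + c) <= 0
    accepted = z[keep]
    # P-value: share of accepted draws at least as extreme as observed.
    stats = np.linalg.norm(accepted, axis=1)
    return np.mean(stats >= stat_obs), keep.mean()

# Illustrative truncation set ||z||^2 <= 4, i.e. A = I, b = 0, c = -4.
A, b, c = np.eye(2), np.zeros(2), -4.0
pval, accept_rate = truncated_mvn_pvalue(1.5, np.zeros(2), np.eye(2), [(A, b, c)])
print(0.0 <= pval <= 1.0)
```

The abstract's scalability concern is visible here: as constraints accumulate or dimension grows, the acceptance rate shrinks and plain accept-reject becomes expensive, motivating more efficient truncated-normal probability computations.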

Keywords

post-selection inference

regression tree

MCMC 

Co-Author

Karl Gregory, Academic Advisor

First Author

Le Chang

Presenting Author

Le Chang