Monday, Aug 4: 10:30 AM - 12:20 PM
0152
Invited Paper Session
Music City Center
Room: CC-105B
Applied: No
Main Sponsor: IMS
Co Sponsors:
International Chinese Statistical Association
Section on Statistical Learning and Data Science
Presentations
Conformal prediction is a framework for predictive inference with a distribution-free, finite-sample guarantee. However, it tends to provide uninformative prediction sets when calibration data are scarce. This paper introduces synthetic-powered predictive inference, a novel framework that incorporates synthetic data---e.g., from a generative model---to improve sample efficiency. At the core of our method is a score transporter: an empirical quantile mapping that aligns nonconformity scores from trusted, real data with those from synthetic data. By carefully integrating the score transporter into the calibration process, the proposed framework provably achieves finite-sample coverage guarantees without making any assumptions about the real and synthetic data distributions. When the score distributions are well aligned, it yields substantially tighter and more informative prediction sets than standard conformal prediction. Experiments on image classification---augmenting data with synthetic diffusion-model-generated images---and on tabular regression demonstrate notable improvements in predictive efficiency in data-scarce settings.
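Since the abstract centers on a score transporter (an empirical quantile mapping between synthetic and real nonconformity scores), the sketch below illustrates one way such a mapping could be wired into split-conformal calibration. It is a simplified, hypothetical illustration, not the exact procedure from the paper: the function names, the pooling step, and the quantile level are assumptions, and the method's actual finite-sample coverage corrections are omitted.

```python
# Hypothetical sketch of score-transporter-style calibration (not the paper's exact procedure).
import numpy as np

def empirical_quantile_map(source, target):
    """Return a map sending a source-sample score to the target-sample quantile at its rank."""
    source = np.sort(np.asarray(source))
    target = np.sort(np.asarray(target))

    def transport(x):
        # empirical CDF of the source (synthetic) sample evaluated at x
        u = np.searchsorted(source, x, side="right") / len(source)
        # matching empirical quantile of the target (real) sample
        idx = np.clip(np.ceil(u * len(target)).astype(int) - 1, 0, len(target) - 1)
        return target[idx]

    return transport

def calibrate_threshold(real_scores, synthetic_scores, alpha=0.1):
    """Pool transported synthetic scores with real ones, then take a conformal-style quantile."""
    transport = empirical_quantile_map(synthetic_scores, real_scores)
    pooled = np.concatenate([real_scores, transport(synthetic_scores)])
    n = len(pooled)
    level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    return np.quantile(pooled, level, method="higher")

# Toy usage: nonconformity scores such as |y - yhat|; the prediction set is
# {y : nonconformity(x, y) <= threshold}.
rng = np.random.default_rng(0)
real_scores = np.abs(rng.normal(0.0, 1.0, size=30))         # scarce real calibration scores
synthetic_scores = np.abs(rng.normal(0.2, 1.2, size=1000))  # plentiful, slightly shifted synthetic scores
print(calibrate_threshold(real_scores, synthetic_scores, alpha=0.1))
```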
Keywords
deep learning
In this talk, we revisit random feature ridge regression (RFRR), a model that has recently gained renewed interest for investigating puzzling phenomena in deep learning—such as double descent, benign overfitting, and scaling laws. Our main contribution is a general deterministic equivalent for the test error of RFRR. Specifically, under a certain concentration property, we show that the test error is well approximated by a closed-form expression that only depends on the feature map eigenvalues. Notably, our approximation guarantee is non-asymptotic, multiplicative, and independent of the feature map dimension—allowing for infinite-dimensional features.
This deterministic equivalent can be used to precisely capture the above phenomenology in RFRR. As an example, we derive sharp excess error rates under standard power-law assumptions on the spectrum and target decay. In particular, we tightly characterize the optimal parametrization achieving the minimax rate.
This is based on joint work with Basil Saeed (Stanford), Leonardo Defilippis (ENS), and Bruno Loureiro (ENS).
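For readers who want a concrete baseline to experiment with, here is a minimal simulation of RFRR itself (a sketch, not the deterministic equivalent from the talk): ridge regression on random ReLU features, with the test error swept over the number of features. The data-generating model and all parameter values are illustrative assumptions; with a small ridge penalty the error curve typically peaks near the interpolation threshold p ≈ n, the double-descent shape the abstract refers to.

```python
# Minimal random feature ridge regression (RFRR) sketch; illustrative only.
import numpy as np

rng = np.random.default_rng(0)
d, n_train, n_test, lam = 20, 200, 2000, 1e-3

# Ground-truth linear target with Gaussian inputs and small label noise.
beta = rng.normal(size=d) / np.sqrt(d)
def sample(n):
    X = rng.normal(size=(n, d))
    y = X @ beta + 0.1 * rng.normal(size=n)
    return X, y

X_tr, y_tr = sample(n_train)
X_te, y_te = sample(n_test)

def rfrr_test_error(p):
    """Ridge regression on p random ReLU features; returns mean squared test error."""
    W = rng.normal(size=(d, p)) / np.sqrt(d)
    Phi_tr = np.maximum(X_tr @ W, 0.0)
    Phi_te = np.maximum(X_te @ W, 0.0)
    a = np.linalg.solve(Phi_tr.T @ Phi_tr + lam * np.eye(p), Phi_tr.T @ y_tr)
    return np.mean((Phi_te @ a - y_te) ** 2)

# Sweep the number of random features; expect a peak near p = n_train.
for p in [50, 100, 200, 400, 800]:
    print(p, rfrr_test_error(p))
```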
Keywords
Random Feature Regression
Random Matrix Theory
Scaling laws
Benign overfitting
Recent empirical studies have shown that diffusion models possess a unique reproducibility property, transitioning from memorization to generalization as the number of training samples increases. This demonstrates that diffusion models can effectively learn image distributions and generate new samples. Remarkably, these models achieve this even with a small number of training samples, despite the challenge of large image dimensions, effectively circumventing the curse of dimensionality. In this work, we provide theoretical insights into this phenomenon by leveraging two key empirical observations: (i) the low intrinsic dimensionality of image datasets and (ii) the low-rank property of the denoising autoencoder in trained diffusion models. Under this setup, we rigorously demonstrate that optimizing the training loss of diffusion models is equivalent to solving the canonical subspace clustering problem across the training samples. This insight has practical implications for training and controlling diffusion models. Specifically, it enables us to precisely characterize the minimal number of samples necessary for accurately learning the low-rank data support, shedding light on the phase transition from memorization to generalization. Additionally, we empirically establish a correspondence between the subspaces and the semantic representations of image data, which enables one-step, transferable, efficient image editing. Moreover, our results have profound practical implications for training efficiency and model safety, and they also open up numerous intriguing theoretical questions for future research.
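The low-rank denoiser observation in (ii) can be illustrated with a tiny linear example: for zero-mean data with covariance Sigma corrupted by additive Gaussian noise of variance sigma^2, the optimal linear denoiser is W = Sigma (Sigma + sigma^2 I)^{-1}, whose rank equals the intrinsic dimension of the data. The sketch below is an illustrative toy under these assumptions, not the trained denoising autoencoder analyzed in the talk; all dimensions and noise levels are made up.

```python
# Toy check that the optimal linear denoiser is low rank when data lie on a subspace.
import numpy as np

rng = np.random.default_rng(0)
d, r, n, sigma = 100, 5, 2000, 0.5   # ambient dim, intrinsic dim, samples, noise level

# Data supported on an r-dimensional subspace of R^d.
U = np.linalg.qr(rng.normal(size=(d, r)))[0]
X = rng.normal(size=(n, r)) @ U.T

# Optimal linear denoiser for additive Gaussian noise: W = Sigma (Sigma + sigma^2 I)^{-1}.
Sigma = X.T @ X / n
W = Sigma @ np.linalg.inv(Sigma + sigma**2 * np.eye(d))

# Its singular values concentrate on r directions, i.e. W is (numerically) rank r.
s = np.linalg.svd(W, compute_uv=False)
print("top singular values:", np.round(s[: r + 3], 3))
print("numerical rank:", int(np.sum(s > 1e-6)))
```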
Keywords
diffusion models
low-dimensionality
distribution learning
Speaker
Qing Qu, University of Michigan