Conformal Prediction and Conformal Inference

Jiawei Shan, Chair
University of Wisconsin-Madison
 
Thursday, Aug 6: 8:30 AM - 10:20 AM
6242 
Contributed Papers 
Thomas M. Menino Convention & Exhibition Center 
Room: CC-102B 

Main Sponsor

Section on Statistical Learning and Data Science

Presentations

Benchmarking Conformal Inference Under Covariate Shift

Conformal prediction has emerged as a robust framework for quantifying predictive uncertainty, with applications in large language models and genomics. However, its coverage guarantees typically rely on the exchangeability assumption, which is frequently violated in real-world scenarios due to covariate shift. While several adaptations maintain validity under distribution shift, the literature lacks a comprehensive, unified evaluation of their performance across diverse data regimes.

This paper presents a rigorous benchmark of state-of-the-art methods for conformal inference under covariate shift. We evaluate Weighted Conformal Prediction (Tibshirani et al.), Conformal Prediction Under Covariate Shift (Park et al.), Robust Conformal Prediction (Chernozhukov et al.), Entropy Balancing, Covariate Shift through Optimal Transport (Giguere et al.), Nex-CP (Barber et al.), and Conformal Prediction with Conditional Density Estimation (Borgwardt et al.).
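As a point of reference for the first method in this list, the following is a minimal sketch of the weighted conformal quantile, assuming the likelihood ratio w(x) between the test and training covariate distributions is known or has been estimated separately. The function name and its arguments are illustrative and are not taken from the benchmark itself.

```python
import numpy as np

def weighted_conformal_quantile(scores, cal_weights, test_weight, alpha):
    """Weighted (1 - alpha) quantile of calibration scores.

    scores      : nonconformity scores on the calibration set
    cal_weights : likelihood-ratio weights w(X_i) for the calibration points
    test_weight : w(x) at the test point (assumed known or estimated)
    """
    scores = np.asarray(scores, dtype=float)
    cal_weights = np.asarray(cal_weights, dtype=float)

    # The test point contributes a point mass at +infinity.
    all_scores = np.append(scores, np.inf)
    all_weights = np.append(cal_weights, test_weight)
    probs = all_weights / all_weights.sum()

    # Smallest score whose cumulative weight reaches 1 - alpha.
    order = np.argsort(all_scores)
    cum = np.cumsum(probs[order])
    idx = np.searchsorted(cum, 1.0 - alpha, side="left")
    idx = min(idx, all_scores.size - 1)  # guard against round-off in cum
    return all_scores[order][idx]

# Hypothetical usage: absolute residuals as scores, interval = prediction +/- q.
q = weighted_conformal_quantile([3.0, 1.0, 2.0], [1.0, 1.0, 1.0], 1.0, alpha=0.25)
```

With equal weights this reduces to the usual split conformal quantile. A test weight that is large relative to the calibration weights pushes the quantile to infinity, which signals that the calibration data carry little information at that test point.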

Our study assesses these methods across high-dimensional synthetic and real-world datasets, focusing on coverage adherence, interval efficiency, and scalability. By identifying the strengths and failure modes of each approach, this 

Keywords

Conformal Inference

Covariate Shift

Conditional Coverage 

Speaker

Yimin Zhao, University of Washington

Co-Author

Michael Wu, Fred Hutchinson Cancer Center

Elastic Distance Metrics and Conformal Methods for Functional Classification and Regression

Functional covariates pose a challenge in classification and regression tasks. Functional data is, by definition, cursed by dimensionality and may have complicated structures, both of which hinder the use of functional covariates in supervised learning. In this work, we present a unique combination of elastic functional distance metrics and conformal prediction methods. Elastic distance metrics enable the measurement of functions in ways that capture both the underlying shape and phase variability, rather than focusing solely on magnitude variability. Conformal prediction methods provide uncertainty quantification (UQ) with theoretical coverage guarantees. We provide a comprehensive simulation study to examine the compatibility of various classification and regression algorithms with UQ metrics in the conformal prediction framework, and an application of these methods to real-world data to demonstrate their utility and ease of use. The results demonstrate that this combination enhances predictive accuracy while delivering reliable uncertainty estimates for responses in the presence of functional covariates. 

Keywords

Functional Data Analysis

Conformal Prediction

Uncertainty Quantification 

Speaker

Gavin Collins

Co-Author(s)

Brandon Berman
Jason Adams

EMS Coreset: An Efficient Expectation-Maximization Algorithm for Sinkhorn Coreset

Coresets distill large datasets into small, representative subsets for efficient downstream learning. Yet Optimal Transport (OT)–based selection typically requires intensive computation of transport plans, limiting scalability. We introduce a scalable Sinkhorn coreset method that permits closed-form updates of the entropically regularized OT coupling by allowing non-uniform coreset weights. This produces centroids that generalize k-means via soft assignments. We establish asymptotic consistency of the selected measure and Lipschitz stability to data perturbations, providing accuracy and robustness guarantees. Across synthetic and real-world benchmarks, the proposed method achieves competitive or improved approximation quality while substantially reducing runtime compared to Wasserstein- and standard Sinkhorn-based coreset selection, especially at large scale. 
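The abstract does not give the update equations, so the following only illustrates the general idea of centroids with soft assignments. It is a plain soft k-means loop in which each point is assigned to centroids through a softmax of negative squared distances scaled by a regularization parameter eps, and the centroid weights are left free (non-uniform). It is not the authors' EMS algorithm; the function name, eps, and the iteration count are assumptions made for this sketch.

```python
import numpy as np

def soft_kmeans(X, k, eps=1.0, n_iter=50, seed=0):
    """Soft k-means: returns centroids, their weights, and the soft assignments."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(n, size=k, replace=False)].copy()

    for _ in range(n_iter):
        # Squared Euclidean distances, shape (n, k).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)

        # Soft assignment step: row-wise softmax of -d2 / eps (stabilized).
        logits = -d2 / eps
        logits -= logits.max(axis=1, keepdims=True)
        resp = np.exp(logits)
        resp /= resp.sum(axis=1, keepdims=True)

        # Centroid step: responsibility-weighted means of the data.
        mass = resp.sum(axis=0)
        centers = (resp.T @ X) / np.maximum(mass, 1e-12)[:, None]

    weights = mass / n  # non-uniform centroid weights, summing to one
    return centers, weights, resp
```

As eps shrinks toward zero the assignments become hard and the loop behaves like Lloyd's k-means; larger eps spreads each point's mass across several centroids.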

Keywords

Coreset

Optimal Transport

Data Distillation

Sinkhorn Loss

EM-algorithm 

Speaker

Haoyun Yin

Co-Author(s)

Chuanhui Liu
Xiao Wang, Purdue University

Fairness-Aligned Conformal Transport for Multivariate Mixed Outcomes

In high-stakes domains, decisions often hinge on jointly predicting multiple, correlated outcomes of mixed type (continuous, ordinal, categorical). Existing multivariate conformal methods impose restrictive geometric assumptions, perform poorly with mixed outcomes, or lack subgroup-conditional guarantees, leading to inflated prediction regions and uneven coverage. We propose FACTOR (Fairness-Aligned Conformal Transport for Optimal Regions), a framework for constructing compact and equitable prediction regions. FACTOR learns an optimal-transport map in a latent space via normalizing flows with input-convex neural networks, providing a principled multivariate ranking without shape constraints. To enforce fairness, we synchronize latent-space ranks across subgroups, yielding distribution-free marginal coverage and a finite-sample O(1/N) bound on subgroup calibration error. A sliding-window cutoff procedure then minimizes prediction region volume while preserving validity. Empirically, on synthetic and six real-world benchmarks, FACTOR consistently achieves target coverage with reduced region volume and subgroup disparities (KS distance) relative to state-of-the-art baselines. 

Keywords

Conformal prediction

Optimal transport

Fairness

Multivariate outcomes 

Speaker

Larry Han, Northeastern University

Co-Author

Chenyin Gao, Harvard University

Inference and Individualized Prediction via Sparse Wrapper Algorithms

Sparse Wrapper Algorithms (SWAG) generate collections of sparse, competitive models, providing an alternative to single-model selection in high-dimensional settings. This talk presents recent advances in statistical testing, post-selection inference, and individualized prediction within the SWAG framework. A permutation-based test is introduced to assess whether SWAG captures meaningful structure beyond chance by examining departures from uniform variable selection under the null. For post-selection inference, George p-values are used to quantify variable importance while accounting for selection uncertainty. Individualized prediction is achieved by stacking predictions across SWAG-selected models, leveraging model diversity to improve predictive stability and performance. Simulation studies and applications illustrate valid inference and improved prediction, highlighting SWAG as a unified approach to testing, inference, and personalized prediction. 

Keywords

Sparse Wrapper Algorithm

Rashomon inference

individualized prediction

Model stacking

post-selection inference 

Speaker

Yagmur Yavuzozdemir

Co-Author

Roberto Molinari, Auburn University

Optimal Variance Reduction with Multiple Synthetic Proxies: A Dynamic Control Variate Framework

Synthetic data analysis augments small, labeled datasets with massive unlabeled datasets containing proxy outcomes. Moving beyond existing single-proxy frameworks, we demonstrate that integrating multiple synthetic copies, either overlapping or disjoint, substantially amplifies estimation efficiency. Operating within a bias-correction framework for the parameter of statistical interest, we identify the optimal variance-minimizing weight for both linear and generalized linear models. This strategy guarantees a "free lunch" for variance reduction even with imperfect proxies, avoiding the model specification assumptions of traditional semi-supervised learning. Furthermore, to address uneven proxy quality, we introduce a dynamic coefficient approach that adapts the correction locally to maximize efficiency where proxies are most reliable. We validate the method through asymptotic theory, simulations, and an analysis of St. Louis housing prices, yielding significantly sharper estimates of school district quality capitalization compared to standard methods. 
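The simplest instance of a bias-corrected estimator with a variance-minimizing weight is the textbook control-variate estimate of a mean, sketched below for several proxies at once. It assumes the labeled and unlabeled samples are independent draws from the same population and that the proxy covariance matrix is invertible. It is not the authors' estimator for linear or generalized linear models, and the function name is hypothetical.

```python
import numpy as np

def control_variate_mean(y_lab, proxies_lab, proxies_unlab):
    """Estimate E[Y] using m proxy columns observed on both samples.

    y_lab         : (n,)   true outcomes on the labeled sample
    proxies_lab   : (n, m) proxy outcomes on the labeled sample
    proxies_unlab : (N, m) proxy outcomes on the unlabeled sample
    """
    y = np.asarray(y_lab, dtype=float)
    F = np.asarray(proxies_lab, dtype=float)
    G = np.asarray(proxies_unlab, dtype=float)
    n, N = F.shape[0], G.shape[0]

    # Sample covariances estimated on the labeled data.
    Fc = F - F.mean(axis=0)
    yc = y - y.mean()
    cov_ff = Fc.T @ Fc / (n - 1)
    cov_fy = Fc.T @ yc / (n - 1)

    # Weight minimizing Var(ybar - lam' (fbar_lab - fbar_unlab)):
    # lam = N / (n + N) * cov_ff^{-1} cov_fy.
    lam = (N / (n + N)) * np.linalg.solve(cov_ff, cov_fy)

    estimate = y.mean() - lam @ (F.mean(axis=0) - G.mean(axis=0))
    return estimate, lam
```

Because the correction term has mean zero for any fixed weight, an uninformative proxy drives the weight toward zero rather than biasing the estimate, which is the sense in which variance reduction comes for free in this simple setting.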

Keywords

Synthetic Data Analysis

Prediction-Based Inference

Multiple Data Integration

Heterogeneous 

Speaker

Yuyang Li, Washington University in St. Louis

Co-Author(s)

Xuming He, Washington University in St. Louis
Jimin Ding, Washington University in St. Louis

Rethinking Conformal Prediction for Binary Classification

In binary classification, standard conformal prediction (CP) often collapses to an uninformative set, either $\emptyset$ or $\{0, 1\}$. We identify a structural cause: for any nonconformity score monotone in $\hat{p}(y \mid x)$, a nontrivial fraction of test points can receive two-label prediction sets. We also show that pointwise level shrinkage $\alpha(x)$ under the standard split CP formulation may not achieve conditional validity, yielding a second impossibility result. Motivated by these limits, we propose GLoSaM, a groupwise CP method with data-driven grouping and adaptive calibration that tightens prediction sets while retaining finite-sample distribution-free guarantees. Across synthetic benchmarks and binary LLM and vision classification settings, GLoSaM achieves valid groupwise coverage, substantially higher singleton rates, and robustness to score choice, outperforming e-value and multi-level conformal baselines. 
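To make the collapse concrete, here is a minimal sketch of standard split CP for a binary classifier, assuming the common nonconformity score $1 - \hat{p}(y \mid x)$. It illustrates the baseline behavior only and is not GLoSaM; the function name and the toy probabilities are hypothetical.

```python
import numpy as np

def split_cp_binary(p1_cal, y_cal, p1_test, alpha):
    """Split conformal prediction sets for labels {0, 1}.

    p1_cal, p1_test : predicted probabilities of label 1
    y_cal           : true calibration labels (0 or 1)
    """
    p1_cal = np.asarray(p1_cal, dtype=float)
    y_cal = np.asarray(y_cal, dtype=int)

    # Calibration scores: one minus the probability given to the true label.
    p_true = np.where(y_cal == 1, p1_cal, 1.0 - p1_cal)
    scores = np.sort(1.0 - p_true)

    # Finite-sample quantile: the ceil((n + 1)(1 - alpha))-th smallest score.
    n = scores.size
    k = int(np.ceil((n + 1) * (1.0 - alpha)))
    qhat = scores[k - 1] if k <= n else np.inf

    sets = []
    for p1 in np.asarray(p1_test, dtype=float):
        s = set()
        if p1 <= qhat:        # score of label 0 is 1 - p_hat(0 | x) = p1
            s.add(0)
        if 1.0 - p1 <= qhat:  # score of label 1 is 1 - p1
            s.add(1)
        sets.append(s)
    return qhat, sets

# Toy calibration data (hypothetical values chosen for exact arithmetic).
p1_cal = [0.875, 0.75, 0.25, 0.125, 0.625, 0.375, 0.5]
y_cal = [1, 1, 0, 0, 1, 0, 1]
q_a, sets_a = split_cp_binary(p1_cal, y_cal, [0.9, 0.5], alpha=0.25)
q_b, sets_b = split_cp_binary(p1_cal, y_cal, [0.9, 0.5], alpha=0.125)
```

On this toy data the confident test point (0.9) receives the singleton {1} at both levels, while the ambiguous point (0.5) receives the empty set at alpha = 0.25 and both labels at alpha = 0.125, so it never receives an informative set.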

Keywords

Groupwise conformal prediction

Local adaptivity

Singleton Accuracy

Exchangeability

Conditional validity 

Speaker

Anqi Zhao

Co-Author(s)

Jungeum Kim, North Carolina State University
Shu Yang, North Carolina State University, Department of Statistics
Yichi Zhang, Department of Statistics, Indiana University Bloomington
Ke Zhu, NCSU and Duke