Neural Optimal Transport Meets Multivariate Conformal Prediction
Published 29 Sep 2025 in stat.ML and cs.LG | (2509.25444v1)
Abstract: We propose a framework for conditional vector quantile regression (CVQR) that combines neural optimal transport with amortized optimization, and apply it to multivariate conformal prediction. Classical quantile regression does not extend naturally to multivariate responses, while existing approaches often ignore the geometry of joint distributions. Our method parametrizes the conditional vector quantile function as the gradient of a convex potential implemented by an input-convex neural network, ensuring monotonicity and uniform ranks. To reduce the cost of solving high-dimensional variational problems, we introduce amortized optimization of the dual potentials, yielding efficient training and faster inference. We then exploit the induced multivariate ranks for conformal prediction, constructing distribution-free predictive regions with finite-sample validity. Unlike coordinatewise methods, our approach adapts to the geometry of the conditional distribution, producing tighter and more informative regions. Experiments on benchmark datasets show improved coverage-efficiency trade-offs compared to baselines, highlighting the benefits of integrating neural optimal transport with conformal prediction.
The paper's main contribution is a unified framework integrating neural optimal transport with conformal prediction to generate geometry-adaptive multivariate uncertainty sets.
It leverages convex neural potentials, amortized optimization, and entropic regularization to achieve efficient and accurate conditional vector quantile regression.
Empirical results show superior predictive set calibration and generative modeling performance over existing methods in high-dimensional settings.
Neural Optimal Transport for Multivariate Conformal Prediction
Introduction and Motivation
The paper "Neural Optimal Transport Meets Multivariate Conformal Prediction" (2509.25444) introduces a unified framework for multivariate conditional uncertainty quantification that leverages advances in optimal transport (OT) theory, convex neural potentials, and conformal prediction (CP). The core problem addressed is the lack of tractable, geometry-aware predictive regions for multivariate responses: traditional quantile regression does not extend naturally to Rd, and most conformal methods for multivariate outputs revert to marginal, axis-aligned, or otherwise inflexible uncertainty sets, failing to exploit the true geometry of the joint distribution. This work offers a scalable, theoretically principled solution by combining neural parameterizations of conditional vector quantiles and ranks with adaptive, distribution-free conformal inference.
Conditional Vector Quantile Regression with Neural Optimal Transport
The authors build upon the foundational work that interprets multivariate quantiles and ranks as OT maps between a reference measure (e.g., multivariate uniform or Gaussian) and the conditional law of Y∣X [carlier2016vector, hallin2021distribution]. For each x, the conditional quantile function is encoded as the gradient of a convex potential φ(u,x), guaranteeing cyclic monotonicity and invertibility (almost everywhere). Crucially, they address the intractability of classical OT-based conditional vector quantile regression (CVQR) by parameterizing φ with a Partially Input Convex Neural Network (PICNN), which is convex in u and conditions freely on x. The conjugate potential and associated rank map are computed by convex duality; inversion and sampling are efficiently handled by modern convex optimization and implicit differentiation algorithms.
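To make the construction concrete, here is a minimal sketch (not the authors' implementation) of a simplified PICNN-style potential in NumPy. The input u enters only through affine maps and nonnegative combinations of convex, nondecreasing activations, so φ(·, x) is convex in u; its gradient is therefore a monotone map, the key property of a vector quantile function. All layer sizes and weights below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
softplus = lambda z: np.logaddexp(0.0, z)  # convex, nondecreasing activation

# Hypothetical weights for a tiny two-layer PICNN-style potential.
d_u, d_x, h = 2, 3, 8
A1, b1 = rng.normal(size=(h, d_x)), rng.normal(size=h)  # context path (unrestricted)
W1, c1 = rng.normal(size=(h, d_u)), rng.normal(size=h)  # first u-layer (unrestricted)
W2_raw = rng.normal(size=(1, h))                        # made nonnegative via softplus
U2, c2 = rng.normal(size=(1, d_u)), rng.normal(size=1)

def phi(u, x):
    """Simplified partially input-convex potential phi(u, x), convex in u."""
    ctx = np.tanh(A1 @ x + b1)            # arbitrary function of the condition x
    z1 = softplus(W1 @ u + ctx + c1)      # convex activation of an affine map of u
    # Nonnegative weights on z1 preserve convexity; an extra affine term in u is fine.
    z2 = softplus(softplus(W2_raw) @ z1 + U2 @ u + c2)
    return z2[0]

def grad_u_phi(u, x, eps=1e-5):
    """Finite-difference gradient of phi in u: the conditional quantile map."""
    g = np.zeros_like(u)
    for i in range(len(u)):
        e = np.zeros_like(u); e[i] = eps
        g[i] = (phi(u + e, x) - phi(u - e, x)) / (2 * eps)
    return g

# Gradients of a convex potential are monotone:
# (grad phi(u1) - grad phi(u2)) . (u1 - u2) >= 0 for all u1, u2.
x = rng.normal(size=d_x)
gaps = []
for _ in range(100):
    u1, u2 = rng.normal(size=d_u), rng.normal(size=d_u)
    gaps.append((grad_u_phi(u1, x) - grad_u_phi(u2, x)) @ (u1 - u2))
print(min(gaps))  # nonnegative up to finite-difference error
```

The monotonicity check is exactly the property that makes the gradient map a valid (cyclically monotone) vector quantile function.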
To further ameliorate the computational challenges, especially in high dimensions, the authors introduce amortized optimization for the dual potential: a neural network is trained to predict (rather than solve from scratch) the maximizer in the Fenchel conjugacy for each (y,x). This substantially speeds up both training and test-time inference, maintaining the convexity/monotonicity structure essential for proper quantile maps.
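The amortization idea can be illustrated on a toy quadratic potential, where the Fenchel conjugate φ*(y) = max_u ⟨u, y⟩ − φ(u) has a closed form. The "amortized predictor" here is mimicked by a warm-start initialization at the true maximizer; a real implementation would train a network to produce this guess for each (y, x). This is a sketch under those assumptions, not the paper's architecture.

```python
import numpy as np

a = 2.0
phi = lambda u: 0.5 * a * np.dot(u, u)          # toy convex potential
conj_exact = lambda y: np.dot(y, y) / (2 * a)   # closed-form Fenchel conjugate

def conjugate_ascent(y, u0, steps=5, lr=0.2):
    """Estimate phi*(y) = max_u <u, y> - phi(u) by gradient ascent from u0."""
    u = u0.copy()
    for _ in range(steps):
        u += lr * (y - a * u)   # gradient of <u, y> - phi(u) w.r.t. u
    return np.dot(u, y) - phi(u), u

rng = np.random.default_rng(1)
y = rng.normal(size=3)

# Cold start (solve from scratch) vs. amortized warm start (predicted maximizer).
val_cold, _ = conjugate_ascent(y, np.zeros(3))
val_warm, _ = conjugate_ascent(y, y / a)        # a perfect amortized guess is u* = y/a

print(conj_exact(y), val_cold, val_warm)  # warm start already attains the supremum
```

Any candidate u gives a lower bound on the conjugate, so a good amortized prediction saves inner-loop iterations without sacrificing correctness.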
Additionally, an entropically regularized variant is presented: introducing an entropic-OT penalty makes the objective smooth and unconstrained, so it can be minimized efficiently with stochastic gradient methods, at the cost of introducing a bias in the transport geometry.
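The smoothness/bias trade-off can be seen in the discrete entropic-OT problem solved by Sinkhorn iterations. This sketch (not from the paper) compares the entropic transport cost against the exact assignment cost between two empirical measures; the bias grows with the regularization strength ε.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.special import logsumexp

rng = np.random.default_rng(2)
n = 32
src = rng.normal(size=(n, 2))          # samples from the reference measure
tgt = rng.normal(size=(n, 2)) + 3.0    # samples from a shifted target measure
C = ((src[:, None, :] - tgt[None, :, :]) ** 2).sum(-1)  # squared-distance cost

# Exact OT between equal-weight empirical measures reduces to an assignment.
row, col = linear_sum_assignment(C)
exact_cost = C[row, col].mean()

def sinkhorn_cost(C, eps, iters=2000):
    """Entropic OT transport cost via log-domain Sinkhorn iterations."""
    n, m = C.shape
    f, g = np.zeros(n), np.zeros(m)
    log_a, log_b = np.full(n, -np.log(n)), np.full(m, -np.log(m))
    for _ in range(iters):
        f = eps * (log_a - logsumexp((g[None, :] - C) / eps, axis=1))
        g = eps * (log_b - logsumexp((f[:, None] - C) / eps, axis=0))
    P = np.exp((f[:, None] + g[None, :] - C) / eps)  # entropic transport plan
    return float((P * C).sum())

c_mid, c_big = sinkhorn_cost(C, eps=0.5), sinkhorn_cost(C, eps=5.0)
print(exact_cost, c_mid, c_big)  # entropic cost upper-bounds the exact cost
```

The entropic plan spreads mass, so its transport cost always exceeds the unregularized optimum, and the gap widens as ε increases; this is the bias the paper accepts in exchange for a smooth, unconstrained objective.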
Figure 1: Points sampled from the reference distribution (left) and the model-generated distribution via the C-NQR_U approximation (right), demonstrating accurate conditional generative modeling.
Conformal Prediction with Multivariate OT-based Ranks
Conformal prediction provides distribution-free, marginal coverage guarantees for predictive sets, but its adaptation to the multivariate setting is nontrivial. The paper defines multivariate conformity scores by applying the learned vector rank map Q^{-1}_{Y∣X} to the observed data, and constructs predictive regions as pullbacks of centrally symmetric balls in the latent (OT reference) space.
Given a calibration set, the conformity score for each observed (Y_i, X_i) is S_i = ‖Q^{-1}_{Y∣X}(Y_i, X_i)‖, and the prediction set for a new input x is the preimage of the smallest ball in the reference space containing a 1−α fraction of the scores. Under conditions satisfied by elliptical or radially symmetric conditional distributions, the authors prove these sets are Lebesgue measure–optimal (minimum volume at prescribed coverage), and they correspond to highest probability density (HPD) regions.
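The split-conformal construction can be sketched end-to-end on a toy Gaussian model where the vector rank map is known in closed form (in the paper it is the learned gradient of the conjugate potential). The radius q below is the standard finite-sample conformal quantile of the calibration scores; all model parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
d, n_cal, n_test, alpha = 2, 2000, 5000, 0.1

# Toy conditional model: Y | X = x ~ N(x, Sigma) with known Sigma, so the
# idealized rank map R(y, x) = L^{-1}(y - x) pushes Y|X to the standard
# Gaussian reference measure.
Sigma = np.array([[1.0, 0.6], [0.6, 1.0]])
L = np.linalg.cholesky(Sigma)
L_inv = np.linalg.inv(L)

def sample(n):
    X = rng.normal(size=(n, d))
    Y = X + rng.normal(size=(n, d)) @ L.T
    return X, Y

rank = lambda Y, X: (Y - X) @ L_inv.T   # vector rank map into reference space

# Split-conformal calibration: radial scores and the finite-sample quantile.
Xc, Yc = sample(n_cal)
scores = np.linalg.norm(rank(Yc, Xc), axis=1)
k = int(np.ceil((n_cal + 1) * (1 - alpha)))
q = np.sort(scores)[k - 1]              # conformal radius in reference space

# Prediction set for x is {y : ||R(y, x)|| <= q}; check marginal coverage.
Xt, Yt = sample(n_test)
covered = np.linalg.norm(rank(Yt, Xt), axis=1) <= q
print(covered.mean())  # close to the nominal level 1 - alpha = 0.9
```

Because the set is the pullback of a ball through the (here, linear) rank map, it automatically inherits the ellipsoidal geometry of the conditional law rather than being an axis-aligned box.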
To further reduce potential miscalibration from misspecified rank functions, an additional reranking step is included (via empirical OT on the calibration set), correcting for anisotropy or distortion in the estimated reference mapping.
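A minimal version of the reranking step, assuming SciPy's assignment solver as the empirical-OT routine (the paper does not prescribe this solver): estimated latent ranks are optimally matched to a fresh sample from the reference measure, and each calibration point inherits its matched reference position, correcting anisotropy in the estimated map.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(4)
n = 200

# Suppose a slightly misspecified rank map produced anisotropic latent
# points (they should resemble the spherical reference but do not).
est_ranks = rng.normal(size=(n, 2)) * np.array([1.8, 0.5])
ref = rng.normal(size=(n, 2))          # fresh sample from the reference measure

# Empirical-OT reranking: optimally match estimated ranks to reference
# samples under squared-distance cost.
C = ((est_ranks[:, None, :] - ref[None, :, :]) ** 2).sum(-1)
row, col = linear_sum_assignment(C)
corrected = ref[col]                   # corrected rank for each calibration point

identity_cost = np.trace(C)            # cost of naively pairing point i with i
ot_cost = C[row, col].sum()
print(ot_cost <= identity_cost)        # the optimal matching is never worse
```

After this correction, the calibration ranks are exactly reference-distributed by construction, so the radial conformity scores are computed against an undistorted latent geometry.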
Empirical Results
Extensive experiments on both synthetic and real-world multi-output regression datasets demonstrate the utility and efficiency of the proposed methods.
For generative modeling, the neural OT-based quantile maps achieve low Sliced Wasserstein-2 (S-W2) distances to the ground truth and outperform previous methods, including classical VQR, recent function-approximate VQR, and convex potential flows. In particular, amortized conjugate models (AC-NQR) achieve superior or competitive distances while requiring less wall-clock time for both training and inference.
Figure 2: Sliced Wasserstein-2 (S-W2) metric across dimensions for Neal’s Funnel, demonstrating stable and scalable generative modeling even in high dimensions.
The L2 unexplained variance metric on convex synthetic datasets highlights the sharpness with which the true conditional quantile map is recovered, especially for the amortized and dual-potential neural approaches.
In conformal prediction, prediction set sizes (normalized log-volumes) and worst-slab coverage values indicate robust and non-conservative uncertainty quantification. The constructed regions are not only valid but also closely track the geometry of the ground-truth conditional law, in contrast to axis-aligned boxes or coordinatewise methods.
Figure 3: Normalized log-volume of prediction sets for various conformal methods, averaged over multiple real datasets, with nominal α=0.1.
Figure 4: Worst-slab coverage at varying miscoverage levels, indicating that most methods achieve or exceed nominal coverage, while the proposed neural OT pullback sets maintain the lowest volume with sharp conditional coverage.
Full metric traces on synthetic datasets (e.g., "Banana", "Star", "Glasses") further demonstrate generative fidelity and calibration (see Figures 6–11).
Theoretical and Practical Implications
The approach realizes for the first time a practical, geometry-adaptive framework for multivariate conformal prediction using neural optimal transport. The main theoretical innovation is to directly exploit the invertible, monotone structure of vector quantile maps parameterized by neural networks, thus inheriting the strong coverage guarantees of CP and the geometric optimality of OT.
In practice, this leads to scalable uncertainty quantification for regression with high-dimensional, complex, and heteroskedastic targets. The amortized optimization and entropic variants make training feasible on large benchmark datasets, and the set-valued predictions adapt to non-elliptical, multi-modal, or varying geometries not accessible by traditional methods.
The work also shows, both empirically and theoretically, that naively composing univariate CP intervals into coordinatewise sets is highly suboptimal for true conditional coverage and efficiency. The proposed sets are uniformly smaller at the same coverage level and align with HPD regions when the model class is well-specified.
Future Directions
The framework naturally extends to incorporate other forms of geometric regularization, richer neural parameterizations (e.g., deeper PICNNs, invertible flows), and alternative OT costs. One promising avenue is the integration with generative models for robust modeling under distribution shift, or for implicit density estimation in complex domains (e.g., molecular, image, or temporal settings).
Open theoretical questions include optimality and sharpness in the presence of misspecification or overparameterization, and whether amortized maps retain coverage under adversarial perturbations. Finally, efficient computation of Jacobians and determinants (for density-based CP sets) in very high dimensions remains a practical challenge.
Conclusion
This paper presents a principled and practical solution for valid and efficient multivariate uncertainty quantification by combining neural optimal transport-based conditional vector quantile regression with conformal calibration. The approach outperforms baseline and recent methods on synthetic and real tasks, advances the state of the art for both geometry-aware generative modeling and predictive inference, and lays the groundwork for further developments at the intersection of OT, deep learning, and sample-splitting conformal inference.