Optimal training-conditional regret for online conformal prediction
Published 18 Feb 2026 in math.ST, cs.IT, cs.LG, and stat.ML | (2602.16537v1)
Abstract: We study online conformal prediction for non-stationary data streams subject to unknown distribution drift. While most prior work studied this problem under adversarial settings and/or assessed performance in terms of gaps of time-averaged marginal coverage, we instead evaluate performance through training-conditional cumulative regret. We specifically focus on independently generated data with two types of distribution shift: abrupt change points and smooth drift. When non-conformity score functions are pretrained on an independent dataset, we propose a split-conformal style algorithm that leverages drift detection to adaptively update calibration sets, which provably achieves minimax-optimal regret. When non-conformity scores are instead trained online, we develop a full-conformal style algorithm that again incorporates drift detection to handle non-stationarity; this approach relies on stability - rather than permutation symmetry - of the model-fitting algorithm, which is often better suited to online learning under evolving environments. We establish non-asymptotic regret guarantees for our online full conformal algorithm, which match the minimax lower bound under appropriate restrictions on the prediction sets. Numerical experiments corroborate our theoretical findings.
The paper introduces DriftOCP, a method that efficiently recalibrates prediction intervals in both abrupt and smooth drift regimes.
It establishes non-asymptotic minimax-optimal regret bounds for pretrained and online-trained scoring regimes using drift detection.
Empirical evaluations demonstrate that DriftOCP achieves stable coverage and rapid adaptation compared to existing adaptive conformal methods.
Minimax-Optimal Online Conformal Prediction under Distribution Drift
Problem Formulation and Motivation
This paper analyzes online conformal prediction for sequential data streams with temporal distribution drift, departing from adversarial settings and focusing instead on training-conditional cumulative regret as the primary metric. The independence assumption across data enables the evaluation of coverage in a manner aligned with classical conformal validity and allows for minimax analysis in non-stationary environments, including both abrupt (change-point) and smooth drift regimes.
Prior works largely studied adversarial marginal coverage or adversarial regret, which may be decoupled from the classical notion of coverage and can fail to ensure valid per-time predictions. The paper demonstrates that long-term time-averaged coverage is generally insufficient for reliable statistical guarantees and motivates cumulative regret measured via training-conditional coverage gaps.
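To make the metric concrete: one plausible reading of training-conditional cumulative regret is the running sum of (positive) gaps between the nominal level 1−α and the true conditional coverage of each predicted set, given the data used to calibrate it. The toy sketch below is an assumption-laden illustration, not the paper's definition: a sliding-window quantile calibrator on a Gaussian stream with one abrupt mean shift, where conditional coverage can be computed exactly.

```python
import math
import numpy as np

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

rng = np.random.default_rng(0)
alpha, T, n_cal = 0.1, 200, 100   # illustrative stream and window sizes (assumed)
mu = 0.0                          # current stream mean; shifts abruptly at t = T/2
cal = rng.normal(mu, 1.0, n_cal)  # calibration window of past observations

regret = 0.0
for t in range(T):
    if t == T // 2:
        mu = 2.0                               # abrupt change point
    q = np.quantile(cal, 1 - alpha)            # one-sided predictive set (-inf, q]
    cov_t = Phi(q - mu)                        # exact conditional coverage given the window
    regret += max(0.0, (1 - alpha) - cov_t)    # positive part of the coverage gap
    cal = np.append(cal[1:], rng.normal(mu, 1.0))  # slide the window forward

print(round(regret, 2))
```

A fixed window recovers only after the shift has flushed through it, so the regret accumulated around the change point is exactly what drift-aware recalibration is designed to cut.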
Algorithms: DriftOCP for Pretrained and Online-Trained Scores
Pretrained Scores Regime
For the scenario where non-conformity scores are pretrained independently, the paper proposes DriftOCP, which leverages a drift detection subroutine to dynamically and efficiently recalibrate prediction intervals. The calibration set is adaptively updated, enabling robust handling of data streams with unknown drift. The procedure is horizon-free and computationally efficient, as quantile statistics and empirical coverage are maintained incrementally.
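The split-conformal recalibration step can be sketched as follows. This is a hypothetical simplification, not the paper's DriftOCP pseudocode: a two-sample Kolmogorov–Smirnov statistic between old and recent scores serves as the drift detector, and stale calibration scores are dropped when it exceeds a threshold (`thresh` is an assumed tuning parameter).

```python
import numpy as np

def ks_stat(a, b):
    """Two-sample Kolmogorov-Smirnov statistic on 1-D score samples."""
    a, b = np.sort(a), np.sort(b)
    grid = np.concatenate([a, b])
    Fa = np.searchsorted(a, grid, side="right") / len(a)
    Fb = np.searchsorted(b, grid, side="right") / len(b)
    return float(np.max(np.abs(Fa - Fb)))

def drift_aware_interval(scores_old, scores_recent, x_t, predict,
                         alpha=0.1, thresh=0.2):
    """Hypothetical split-conformal step: drop stale calibration scores
    when the KS drift statistic between old and recent scores is large."""
    if scores_old and scores_recent and \
            ks_stat(np.array(scores_old), np.array(scores_recent)) > thresh:
        scores_old = []                        # drift detected: reset calibration set
    cal = np.array(scores_old + scores_recent)
    n = len(cal)
    q = np.quantile(cal, min(1.0, (1 - alpha) * (n + 1) / n))  # finite-sample level
    yhat = predict(x_t)
    return (yhat - q, yhat + q), scores_old
```

Maintaining the sorted score lists incrementally would make each step logarithmic-time, consistent with the incremental quantile maintenance described above.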
Online-Trained Scores Regime
In settings where scores and models are trained online, the proposed full-conformal version of DriftOCP eschews permutation symmetry and instead relies on stability of the model-fitting algorithm (e.g., online SGD). Calibration windows are adaptively selected based on drift statistics, and quantile recalibration is implemented via a doubled-round structure that adapts to drift without explicit knowledge of the horizon or the drift structure.
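The interplay of a stable online learner, an adaptive score window, and doubling rounds can be sketched as below. This is a split-style stand-in for the paper's full-conformal construction, with all hyperparameters (`eta`, the halving rule, the data-generating model) assumed for illustration: a 1-D linear model is fit by online SGD, absolute residuals under the current model serve as nonconformity scores, and at the end of each doubling round the score window is pruned to its most recent half.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, eta, T = 0.1, 0.05, 512
w = 0.0                        # 1-D linear model parameter, learned online
scores, covered = [], []
round_end = 1                  # doubling-round schedule: prune at t = 1, 2, 4, 8, ...

for t in range(1, T + 1):
    x = rng.normal()
    y = 1.5 * x + 0.3 * rng.normal()       # illustrative stream (assumed)
    if scores:
        q = np.quantile(scores, 1 - alpha)      # calibrated radius from current window
        covered.append(abs(y - w * x) <= q)     # did [w*x - q, w*x + q] cover y?
    scores.append(abs(y - w * x))               # nonconformity score under current model
    w -= eta * 2 * (w * x - y) * x              # stable online SGD step (squared loss)
    if t == round_end:                          # end of a doubling round:
        round_end *= 2
        scores = scores[-max(1, t // 2):]       # keep only the most recent half

print(round(float(np.mean(covered[-200:])), 3))
```

Because SGD with a small step size changes the score function only slightly per round, scores collected under slightly older models remain approximately exchangeable with fresh ones, which is the role the stability assumption plays in place of permutation symmetry.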
Theoretical Guarantees
Regret Bounds
The paper establishes non-asymptotic upper bounds for training-conditional cumulative regret, matching minimax lower bounds up to logarithmic factors across both change-point and smooth-drift settings:
Change-point regime: regret scales as $O(\sqrt{(N_{\mathrm{cp}}+1)\,T})$, where $N_{\mathrm{cp}}$ is the unknown number of change points.
Smooth-drift regime: regret scales as $O(\sqrt{T} + \mathrm{KS}_T^{1/3}\,T^{2/3})$ for pretrained scores, or $O((L+1)\sqrt{T} + \mathrm{TV}_T^{1/3}\,T^{2/3})$ for online-trained scores, where $L$ denotes the stability parameter and $\mathrm{KS}_T$, $\mathrm{TV}_T$ quantify the cumulative drift.
The regret depends on score-based Kolmogorov–Smirnov (KS) distances for pretrained models and total-variation (TV) distance for online-trained models, highlighting the tighter control achievable when leveraging score-space adaptation.
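These rates imply vanishing time-averaged regret whenever the drift budget is sublinear. The small computation below makes that explicit for illustrative (assumed) budgets $N_{\mathrm{cp}} = 5$ and $\mathrm{KS}_T = \sqrt{T}$, dropping constants and logarithmic factors:

```python
# Time-averaged regret implied by the stated rates (constants dropped):
#   change-point:  sqrt((N_cp + 1) * T) / T
#   smooth drift:  (sqrt(T) + KS_T^(1/3) * T^(2/3)) / T
Ncp = 5                                    # illustrative number of change points (assumed)
rows = []
for T in (10**3, 10**5, 10**7):
    KS_T = T ** 0.5                        # illustrative sublinear drift budget (assumed)
    cp = ((Ncp + 1) * T) ** 0.5 / T        # = sqrt((Ncp + 1) / T)
    sm = (T ** 0.5 + KS_T ** (1 / 3) * T ** (2 / 3)) / T   # = T^(-1/2) + T^(-1/6)
    rows.append((T, cp, sm))
    print(T, round(cp, 5), round(sm, 5))
```

Both averages decay to zero, with the smooth-drift term dominated by the slower $T^{-1/6}$ component under this budget.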
Minimax Lower Bounds
Matching minimax lower bounds are proven for cumulative regret under both score-based and set-based constructions, with the bounds expressed in terms of structural complexity (prediction sets formed as unions of at most K intervals) and the drift budget. These bounds rule out vacuous procedures and underscore the necessity of geometric or functional restrictions for a meaningful minimax analysis.
Numerical Results
Empirical evaluations demonstrate DriftOCP's superiority over Adaptive Conformal Inference (ACI) in adapting to both abrupt and smooth drift. Notably, DriftOCP achieves:
Stable tracking of calibration quantiles during stationary phases.
Rapid re-alignment post-change points, yielding consistently controlled regret across regimes.
Uniform coverage behavior regardless of underlying drift structure—see results below.
Figure 1: Cumulative regret and calibration quantiles under four data-generating settings; DriftOCP achieves stable regret over stationary segments and rapid adaptation to distributional shifts.
The full-conformal algorithm with online adaptively-trained scores yields tighter prediction intervals and more stable coverage, especially under variance drift, compared to pretrained baselines and covariate-agnostic scores.
Figure 2: Online conformal prediction with pretrained vs. adaptive score strategies; adaptive-score methods (SGD) achieve short intervals and stable coverage, outperforming pretrained and model-free approaches, especially under misspecification and drift.
Practical and Theoretical Implications
The proposed framework offers:
Data-efficient, horizon-free conformal adaptation for online predictive inference.
Adaptivity to unknown distributional structures, with no prior knowledge required for drift magnitude, change points, or calibration window selection.
Minimax-optimal regret guarantees under both stationary and non-stationary environments.
Theoretical justification for stable online learning as sufficient for (non-vacuous) conformal inference under distribution drift, even when permutation symmetry breaks.
Practically, these results support robust uncertainty quantification in non-stationary data environments common in real-world streaming and sequential regression problems. The approach is computationally tractable and compatible with black-box predictive models, including those trained via online convex optimization protocols.
Theoretically, the minimax analysis sets fundamental limits for adaptive online coverage, establishing when and how training-conditional coverage can be achieved, and pointing to open questions for dependent data and for nonparametric and deep-learning settings where stability may not be attainable.
Speculation and Future Directions
Future work will need to address:
Extension of these methods to temporally dependent and non-exchangeable settings (e.g., Markovian or time series data).
Training-conditional guarantees for full-conformal methods in unstable or nonparametric regimes, such as deep learning.
Application and adaptation of the horizon-free, drift-aware calibration framework to other online statistical domains (e.g., online multicalibration, robust time-series forecasting).
Conclusion
This paper resolves a fundamental gap in online conformal prediction under distribution drift, delivering minimax-optimal algorithms and regret analysis for both pretrained and adaptively-trained score regimes. The methods are practical, computationally scalable, and theoretically sound, with strong guarantees in non-stationary environments. The rigorous minimax characterization, together with data-driven calibration protocols, provides a robust foundation for sequential uncertainty quantification in high-dimensional, evolving data streams.