- The paper establishes structural equivalences between PAC guarantees from conformal prediction, scenario optimization, and the holdout method in data-driven reachability analysis.
- It demonstrates that scenario optimization with sample discarding reduces reachable set volume and produces ex post guarantees, while conformal prediction offers marginal, ex ante coverage requiring larger samples.
- Empirical validation using the chaotic Duffing oscillator confirms that appropriately parameterized methods yield nearly identical PAC bounds, highlighting key trade-offs in sample complexity and computational efficiency.
PAC Guarantees for Data-Driven Reachability: Theoretical Equivalences and Empirical Perspectives
Introduction
This work rigorously examines Probably Approximately Correct (PAC) guarantees across prominent data-driven reachability analysis techniques, focusing specifically on conformal prediction, scenario optimization (with sample discarding), and the holdout method. The primary motivation is to formally connect the probabilistic guarantees provided by these methods when explicit dynamical system models are unavailable or intractable, and only trajectory data is accessible. The results emphasize the structural and interpretational distinctions between these methods and evaluate practical implications in safety-critical verification domains.
Data-Driven Reachability with PAC Guarantees
Finite-horizon reachability analysis aims to estimate the set of states a system can reach from given initial conditions under disturbances. In contrast to model-based approaches guaranteeing absolute safety, data-driven methods utilize trajectory samples and deliver probabilistic guarantees: a learned reachable set contains most, but not necessarily all, unseen system evolutions with high probability.
All methods of interest produce outer approximations R̂ of the true reachable set with guarantees of the form:
P^N(Error > ε) ≤ β
where the error denotes the violation probability, i.e., the probability that an unseen trajectory lies outside the estimated reachable set, and β is the significance level.
Scenario Optimization with Sample Discarding
Scenario optimization is rooted in chance-constrained programming, turning intractable probabilistic constraints into tractable deterministic constraints on N i.i.d. system trajectories. The classic sample-and-discard variant permits removal of up to k violated constraints to reduce conservatism in the learned set. Key results sharply characterize the allowable k in terms of accuracy ε and confidence β, producing explicit PAC bounds via combinatorial expressions or tight inequality approximations (cf. [scenario-sample-discard]).
Sample discarding enables non-trivial volume reduction in the approximation, contingent on the convexity of the parameterized reachable set (in the decision variables θ) and independence of samples. The program’s solution formalizes as:
min_θ Vol(θ) subject to g(δ^(i), θ) ≤ 0 for the N − k retained samples δ^(i)
and the resulting PAC bound ties together the number of decision variables d, the number of discarded samples k, the sample size N, and the accuracy and confidence levels ε and β; in the classical sample-and-discard form, β ≥ binom(k+d−1, k) · Σ_{i=0}^{k+d−1} binom(N, i) ε^i (1−ε)^(N−i).
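The dependence of β on (N, k, d, ε) can be evaluated directly. A minimal sketch of the classical sample-and-discard bound stated above (the function name and the numerical parameters are illustrative, not from the paper):

```python
from math import comb

def scenario_discard_beta(N: int, k: int, d: int, eps: float) -> float:
    """Confidence level beta for scenario optimization with sample discarding:
    beta >= binom(k+d-1, k) * sum_{i=0}^{k+d-1} binom(N, i) eps^i (1-eps)^(N-i),
    where d is the number of decision variables and k the discarded samples."""
    tail = sum(comb(N, i) * eps**i * (1 - eps) ** (N - i) for i in range(k + d))
    return min(1.0, comb(k + d - 1, k) * tail)

# Illustrative parameterization: 1000 trajectories, 10 discarded samples,
# a 3-parameter set, accuracy level eps = 0.05.
beta = scenario_discard_beta(1000, 10, 3, 0.05)
```

As expected, β shrinks rapidly as ε grows or as fewer samples are discarded, which is the lever the paper uses to trade set volume against confidence.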
Split Conformal Prediction
Split conformal prediction calibrates a learned predictor (the reachable set estimator) on additional holdout data and computes coverage sets from empirical quantiles of nonconformity scores. The user specifies a target miscoverage level α, and the procedure constructs a set-valued prediction guaranteed to contain a 1 − α fraction of unseen outcomes marginally. Importantly, the split conformal method yields marginal coverage, and its PAC guarantees manifest as tail probabilities of the Beta distribution induced by the calibration set size N and the error level ε. The bound on the violation probability for an unseen sample, conditional on the calibration data, is explicitly
P^N(Error > ε) ≤ β
where β can be computed from quantiles of the Beta distribution.
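A sketch of that computation, assuming the standard split conformal setup in which the conditional miscoverage follows a Beta distribution with integer parameters, so its upper tail reduces to a binomial CDF (the function names are illustrative):

```python
from math import comb, ceil

def binom_cdf(m: int, n: int, p: float) -> float:
    """P(X <= m) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(m + 1))

def conformal_beta(n_cal: int, alpha: float, eps: float) -> float:
    """Confidence beta with P^N(miscoverage > eps) <= beta for split conformal
    with n_cal calibration points at target level alpha.

    Conditional miscoverage ~ Beta(n_cal + 1 - k, k) with
    k = ceil((1 - alpha)(n_cal + 1)); for integer parameters the Beta tail
    beyond eps equals a binomial CDF (incomplete beta function identity)."""
    k = ceil((1 - alpha) * (n_cal + 1))
    return binom_cdf(n_cal - k, n_cal, eps)

beta = conformal_beta(n_cal=500, alpha=0.1, eps=0.15)
```

With ε comfortably above α, the tail probability β is small; as ε approaches α it grows toward 1/2, which is the coverage-variance effect the empirical study highlights.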
Holdout Method
The holdout approach relies purely on empirical error counting in a test set of N fresh i.i.d. samples. For a given reachable set approximation, the frequency of violations among the holdout samples supplies a binomial tail inversion confidence bound on the true violation probability:
β = Σ_{i=0}^{k} binom(N, i) ε^i (1−ε)^(N−i)
where k violations among N samples yield, via inversion of the binomial CDF, a tight PAC bound on the error.
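The inversion step can be sketched as follows (names and parameters are illustrative; bisection is one of several ways to invert the monotone binomial tail):

```python
from math import comb

def binom_cdf(m: int, n: int, p: float) -> float:
    """P(X <= m) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(m + 1))

def holdout_eps_bound(k: int, n: int, beta: float, tol: float = 1e-10) -> float:
    """Smallest eps with P(Binomial(n, eps) <= k) <= beta, found by bisection.

    With k violations among n holdout samples, the true violation probability
    exceeds the returned eps with probability at most beta
    (a Clopper-Pearson-style upper confidence bound)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if binom_cdf(k, n, mid) > beta:
            lo = mid  # tail still too heavy: eps must be larger
        else:
            hi = mid  # tail already below beta: eps can shrink
    return hi

eps = holdout_eps_bound(k=5, n=1000, beta=1e-3)
```

The returned ε is strictly above the empirical rate k/n, with the gap shrinking as n grows, which is the sample-complexity trade-off discussed below.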
Structural Equivalences
A key contribution is the rigorous demonstration that, when appropriately parameterized, the PAC bounds arising from split conformal prediction, scenario optimization with sample discarding, and the holdout method are structurally equivalent:
- Conformal vs. Holdout: An empirical conformal procedure, when the nonconformity score is based on violations and with parameters chosen a posteriori, yields exactly the binomial tail PAC bound of the holdout method. The equivalence is made precise by mapping the Beta distribution (of conformal training-conditional coverage) to the incomplete Beta function form of the binomial CDF tail [pmlr-v25-vovk12], [angelopoulos2022gentleintroductionconformalprediction].
- Conformal vs. Scenario Optimization: By relating the retained sample quantiles in both methods, the authors show that split conformal prediction (with training-conditional coverage) and sample-discarding scenario optimization can be parameterized to yield identical reachable sets and structurally identical PAC bounds. This is achieved by jointly choosing the calibration set size N and the error levels ε and β so that both methods discard k samples and retain N − k (cf. the derived expressions and Proposition/Theorem mapping).
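The Beta-to-binomial mapping underlying these equivalences can be checked numerically: for integer parameters, the upper tail of the Beta distribution equals a binomial CDF (the incomplete beta function identity), so the conformal and holdout routes produce the same β. A small self-contained check, with illustrative parameter choices:

```python
from math import comb, ceil

def binom_cdf(m: int, n: int, p: float) -> float:
    """P(X <= m) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(m + 1))

def beta_survival(a: int, b: int, eps: float, steps: int = 20000) -> float:
    """P(Z > eps) for Z ~ Beta(a, b) with integer a, b, via Simpson's rule."""
    norm = b * comb(a + b - 1, a - 1)  # 1 / B(a, b) for integer a, b
    f = lambda x: norm * x ** (a - 1) * (1 - x) ** (b - 1)
    h = (1.0 - eps) / steps
    total = f(eps) + f(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(eps + i * h)
    return total * h / 3

# Conformal training-conditional miscoverage ~ Beta(n + 1 - k, k) with
# k = ceil((1 - alpha)(n + 1)); the holdout bound is the binomial tail
# with n - k "violations" among n samples. The two coincide.
n, alpha, eps = 200, 0.1, 0.12
k = ceil((1 - alpha) * (n + 1))
beta_conf = beta_survival(n + 1 - k, k, eps)
beta_hold = binom_cdf(n - k, n, eps)
assert abs(beta_conf - beta_hold) < 1e-6
```

This mirrors the paper's mapping of the conformal Beta distribution onto the incomplete Beta function form of the binomial CDF tail.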
Despite this formal correspondence, the key distinction is interpretational: scenario optimization provides an ex post PAC guarantee (a property of the particular learned set and dataset), while conformal prediction's standard treatment is ex ante (valid for future unseen data with high probability).
Empirical Study
The equivalences are empirically validated using the chaotic Duffing oscillator as a benchmark system. Reachable sets are estimated via both conformal and scenario pipelines, and the PAC bounds are calculated across multiple trials:
- Conformal vs. Holdout (empirical): The distributions of violation probability bounds match up to numerical precision across 50 repeated experiments, confirming exact equivalence when both methods operate on the same data [(2604.02953), Fig. 2].
- Conformal vs. Scenario Optimization: For large numbers of calibration samples, the retained reachable set volumes obtained via both methods are indistinguishable. However, for smaller calibration sets, the conformal set contracts to counteract calibration uncertainty, potentially producing excessive conservatism, while scenario optimization maintains a less conservative approximation.
The analysis also shows that while the computational overhead of both methods is low when discarding is performed via sorted quantile thresholds, conformal prediction requires significantly larger sample sizes to compensate for variance in coverage, especially for stringent confidence levels (small β).
Implications, Limitations, and Future Directions
This theoretical and empirical analysis yields several key insights:
- Practical Equivalences: Despite their disparate origins, with proper design, all three methods can be used for PAC-style safety certificates in black-box system settings. Their guarantees are not strictly interchangeable, as the ex ante vs. ex post distinction influences how practitioners interpret and deploy results.
- Sample Complexity: Conformal prediction’s sample requirements for narrow PAC bounds (training-conditional) far exceed those of scenario optimization, resulting in practical trade-offs between accuracy and computational feasibility.
- Computational Considerations: Both scenario optimization and conformal prediction admit efficient implementations in typical convex parameterizations, but scenario optimization’s ex post bounds align better with application-driven safety constraints.
On the theoretical side, these results encourage transfer of algorithmic innovations between the UQ community (where conformal prediction is prevalent) and control/verification researchers employing scenario optimization. Practically, the findings encourage careful, context-dependent selection of the verification method, especially in settings with tight sampling constraints or when coverage of future unseen data is required.
Future work should investigate extensions to non-convex parameterizations and adaptive discarding policies, and formalize ex ante PAC bounds for scenario-based methods. Hybrid approaches combining data-driven estimation with partial model knowledge may further sharpen the achievable guarantees.
Conclusion
This paper establishes foundational connections between PAC guarantees from conformal prediction, scenario optimization (sample discarding), and the holdout method in data-driven reachability analysis. The results rigorously map the structure and interpretation of these bounds, elucidate the underlying probability-theoretic equivalences, and clarify their computational and sample trade-offs. Scenario optimization emerges as the most sample-efficient approach for ex post PAC-guaranteed reachable set estimation under convex parameterization, while conformal approaches remain valuable for marginal (ex ante) coverage settings. The techniques and equivalences detailed here set the stage for more systematic, theory-grounded deployment of data-driven verification and safety certification algorithms in dynamical systems without explicit models.
References:
See (2604.02953) and cited works therein for full technical details and additional empirical illustrations.