Data-Driven Reachability of Nonlinear Lipschitz Systems via Koopman Operator Embeddings

Published 31 Mar 2026 in eess.SY | (2604.00150v1)

Abstract: Data-driven safety verification of robotic systems often relies on zonotopic reachability analysis due to its scalability and computational efficiency. However, for nonlinear systems, these methods can become overly conservative, especially over long prediction horizons and under measurement noise. We propose a data-driven reachability framework based on the Koopman operator and zonotopic set representations that lifts the nonlinear system into a finite-dimensional, linear, state-input-dependent model. Reachable sets are then computed in the lifted space and projected back to the original state space to obtain guaranteed over-approximations of the true dynamics. The proposed method reduces conservatism while preserving formal safety guarantees, and we prove that the resulting reachable sets over-approximate the true reachable sets. Numerical simulations and real-world experiments on an autonomous vehicle show that the proposed approach yields substantially tighter reachable set over-approximations than both model-based and linear data-driven methods, particularly over long horizons.


Authors (4)

Summary

The paper introduces a Koopman-based lifting that transforms nonlinear Lipschitz systems into a linear surrogate for precise reachability analysis.
It leverages zonotopic over-approximations along with rigorous error quantification to maintain formal safety guarantees despite noise and uncertainties.
The approach demonstrates reduced conservatism and improved performance over traditional methods, validated through numerical and experimental results on diverse systems.

Introduction

The paper "Data-Driven Reachability of Nonlinear Lipschitz Systems via Koopman Operator Embeddings" (2604.00150) proposes a framework for safety verification via reachability analysis, targeting nonlinear Lipschitz systems with unknown dynamics. The method combines data-driven system identification with formal verification. It learns state-input-dependent Koopman operator embeddings from noisy trajectories, so that an unknown, possibly non-affine, nonlinear system can be approximated by a linear model in a lifted observable space. Convex zonotopic over-approximations are propagated through this linear model and then projected back to the original state space, yielding less conservative reachable set estimates with formal guarantees.

A fundamental limitation of existing scalable data-driven reachability methods—especially for nonlinear systems—is excessive conservatism in the presence of nonlinearity, long prediction horizons, or measurement noise. Model-based reachability, although sound, often fails when system parameters or nonlinearities are not precisely known. By contrast, the presented Koopman-based framework provides a strictly data-driven alternative that significantly reduces conservatism, as demonstrated numerically and experimentally.

Figure 1: Over-approximated reachable sets derived from the reachable sets of the Koopman-lifted system, with residual set modeling to account for approximation errors.

Koopman Operator Embeddings and Lifting Design

Central to the framework is the data-driven identification of a Koopman lifting, which provides a finite-dimensional linear surrogate for the underlying nonlinear discrete-time system

$$x_{k+1} = f(x_k, u_k) + w_k,$$

where $w_k$ is bounded process noise and $f$ is an unknown nonlinear Lipschitz continuous map. The approach constructs a lifting $\psi(x_k, u_k)$ that includes state-dependent and state-input-dependent observables, allowing the nonlinear system to be approximated as

$$\phi(x_{k+1}) = A \phi(x_k) + B \nu(x_k, u_k) + r_{\psi_k} + r_{w_k}.$$

Here, $\phi(x_k)$ are the lifted states, $\nu(x_k, u_k)$ are observables encoding input effects, and $r_{\psi_k}$ and $r_{w_k}$ are residual terms that capture what the linear lifted model does not. The matrices $A$ and $B$ are identified from trajectory data via least-squares regression.
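The least-squares identification step can be illustrated with a minimal sketch. The observables `phi` and `nu`, the toy dynamics, and the function name `fit_lifted_model` are all assumptions chosen for illustration; they are not the observables or systems used in the paper, and the sketch ignores measurement noise.

```python
import numpy as np

def phi(x):
    # Hypothetical state observables: the state and its square.
    return np.array([x[0], x[0] ** 2])

def nu(x, u):
    # Hypothetical state-input observables: the input and a state-input product.
    return np.array([u[0], x[0] * u[0]])

def fit_lifted_model(X, U, X_next):
    """Least-squares fit of phi(x_next) ~ A phi(x) + B nu(x, u)."""
    Phi = np.stack([phi(x) for x in X], axis=1)               # (n_phi, N)
    Nu = np.stack([nu(x, u) for x, u in zip(X, U)], axis=1)   # (n_nu, N)
    Phi_next = np.stack([phi(x) for x in X_next], axis=1)     # (n_phi, N)
    G = np.vstack([Phi, Nu])                                  # stacked regressors
    AB = Phi_next @ np.linalg.pinv(G)                         # [A B] via pseudoinverse
    n_phi = Phi.shape[0]
    A, B = AB[:, :n_phi], AB[:, n_phi:]
    residuals = Phi_next - AB @ G                             # per-sample model error
    return A, B, residuals

# Toy non-affine-in-(x,u) dynamics, assumed for illustration: x+ = 0.9 x + 0.1 x u.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 1))
U = rng.uniform(-1, 1, size=(200, 1))
X_next = 0.9 * X + 0.1 * X * U
A, B, residuals = fit_lifted_model(X, U, X_next)
```

In this toy case the first lifted coordinate is reproduced exactly because `x * u` is among the observables, while the second coordinate (`x ** 2`) leaves a nonzero residual. Residuals of this kind are what a sound method must bound rather than discard.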

This lifted parameterization is formally justified for non-affine systems—an explicit departure from prior LTI-based EDMD or purely model-based approaches. The selection of observables is problem dependent, and increasing their expressivity (e.g., using high-order polynomials or basis expansions) allows for more accurate representations but at increased computational cost.

Zonotopic Reachable Set Propagation

Reachability is conducted in the lifted space via zonotopic set arithmetic, exploiting the closure of zonotopes under affine maps and Minkowski sums. Key theoretical contributions include:

Rigorous Error Accounting: All uncertainty sources (process noise, model approximation error, and incomplete data coverage due to finite sampling) are over-approximated in the lifted space and projected back, ensuring that the resulting reachable sets provably enclose all consistent true trajectories.
Residual and Noise Lifting: The Lipschitz property of the lifting functions permits computation of over-approximated residuals due to noise, providing tight error bounds in the presence of bounded measurement and process perturbations.
Data-Driven Error Extension: The method leverages the covering radius between the data and the state-input domain to safely extrapolate the residual model error beyond the direct support of the dataset, using Lipschitz continuity.
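The covering-radius argument can be sketched as follows. If a residual function is Lipschitz with constant $L$ and every point of the domain lies within distance $\delta$ of some data point, then the residual norm anywhere in the domain is at most the largest residual norm observed on the data plus $L\delta$. The code below is a minimal sketch under stated assumptions: a box-shaped state-input domain, a Euclidean norm, a known Lipschitz constant, and a grid-based estimate of $\delta$. The function names are hypothetical, and the paper's exact bound may be constructed differently.

```python
import numpy as np
from itertools import product

def covering_radius(data, lower, upper, points_per_axis=50):
    """Upper bound on sup over the box of the distance to the nearest data point."""
    axes = [np.linspace(lo, hi, points_per_axis) for lo, hi in zip(lower, upper)]
    grid = np.array(list(product(*axes)))                         # (M, d) grid nodes
    dists = np.linalg.norm(grid[:, None, :] - data[None, :, :], axis=2)
    delta_grid = dists.min(axis=1).max()                          # worst grid node
    # Every point of the box is within half a cell diagonal of some grid node,
    # so adding that term turns the grid estimate into a valid upper bound.
    spacing = (np.asarray(upper) - np.asarray(lower)) / (points_per_axis - 1)
    half_diag = 0.5 * np.linalg.norm(spacing)
    return delta_grid + half_diag

def residual_bound(residual_norms, lipschitz_const, delta):
    """Bound on the residual norm over the whole domain via Lipschitz continuity."""
    return np.max(residual_norms) + lipschitz_const * delta

# Example with assumed numbers: 100 random samples in the unit square.
rng = np.random.default_rng(1)
samples = rng.uniform(0, 1, size=(100, 2))
delta = covering_radius(samples, lower=[0, 0], upper=[1, 1])
bound = residual_bound(np.array([0.02, 0.05, 0.03]), lipschitz_const=2.0, delta=delta)
```

The bound shrinks as the data fill the domain more densely, which matches the intuition that sparse datasets force more conservative residual sets.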

Compared with approaches relying on statistical (e.g., conformal prediction) guarantees, this mechanism ensures deterministic safety by robustly handling the compounding effects of multiple uncertainty sources through bounded zonotopic sets.
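The set arithmetic behind the propagation can be sketched in a few lines. A zonotope is stored as a center and a generator matrix; a linear map acts on both, and a Minkowski sum adds centers and concatenates generators. The `Zonotope` class, the `lifted_step` function, and all numerical values below are illustrative assumptions. In particular, the sets enclosing the input observables and the residuals are taken as given here, whereas the paper derives them from data.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class Zonotope:
    center: np.ndarray      # shape (n,)
    generators: np.ndarray  # shape (n, p)

    def linear_map(self, M):
        # Image of the zonotope under x -> M x.
        return Zonotope(M @ self.center, M @ self.generators)

    def minkowski_sum(self, other):
        # Centers add; generator matrices are concatenated column-wise.
        return Zonotope(self.center + other.center,
                        np.hstack([self.generators, other.generators]))

    def interval_hull(self):
        # Tightest axis-aligned box containing the zonotope.
        radius = np.abs(self.generators).sum(axis=1)
        return self.center - radius, self.center + radius

def lifted_step(Z_phi, Z_nu, Z_res, A, B):
    """One step of Z_next = A Z_phi (+) B Z_nu (+) Z_res in the lifted space."""
    return Z_phi.linear_map(A).minkowski_sum(Z_nu.linear_map(B)).minkowski_sum(Z_res)

# Assumed example: two lifted states, one input observable.
A = np.array([[0.9, 0.0], [0.0, 0.81]])
B = np.array([[0.1], [0.0]])
C = np.array([[1.0, 0.0]])  # projection onto the original (first) coordinate
Z_phi = Zonotope(np.array([1.0, 1.0]), 0.1 * np.eye(2))
Z_nu = Zonotope(np.array([0.5]), np.array([[0.2]]))
Z_res = Zonotope(np.zeros(2), 0.01 * np.eye(2))

Z_next = lifted_step(Z_phi, Z_nu, Z_res, A, B)
lo, hi = Z_next.linear_map(C).interval_hull()
```

Each step appends generators, so practical implementations typically apply an order-reduction step to keep the generator count bounded. That step is omitted here.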

Numerical and Experimental Results

The method is validated in three scenarios: a continuous stirred-tank reactor, a non-affine nonlinear system, and a real autonomous vehicle.

Affine System (CSTR):

For a nonlinear continuous stirred-tank reactor system, the proposed Koopman-based method yields significantly less conservative reachable sets compared to standard model-based and least-squares data-driven (LTI) set propagation, especially over long horizons.

Non-Affine System:

For a highly nonlinear, non-affine system, the Koopman-lifted framework again outperforms alternative methods, with substantial reductions in the volume and growth rate of the reachable sets, while maintaining over-approximation guarantees.

Real-World Autonomous Vehicle:

Experiments on an NVIDIA Jetson-based autonomous car platform (JetRacer ROS AI Kit, see Figure 2) with noisy input-state data further support the practical utility of the method. The reachable set propagation on the vehicle shows that the method achieves much tighter conformance with underlying system behavior than baseline data-driven approaches.

Figure 2: The JetRacer ROS AI Kit.

Implications and Future Directions

This work demonstrates that Koopman-based finite-dimensional embeddings with state-input-dependent observables represent a scalable, less conservative approach for reachability analysis in the absence of explicit models. The method ensures formal coverage with rigorously derived set over-approximations, making it suitable for deployment in safety-critical robotics and general nonlinear control applications where accurate models are unavailable or difficult to obtain.

Key theoretical implications include the formal bridging of data-driven identification and reachability analysis for systems with non-affine control structure and Lipschitz continuity. Practically, the demonstrated reduction in conservatism directly translates to more robust yet less restrictive constraint satisfaction in verification and motion planning pipelines. This is relevant for complex, high-dimensional robotic platforms where existing methods either lack robustness under uncertainty or scale poorly.

Potential directions for future research include extensions to time-varying and hybrid systems, incorporation of adaptive observable selection (potentially via automated basis discovery or neural surrogates with interpretable constraints), and further integration with scalable controller synthesis frameworks that explicitly leverage the formal reachability guarantees.

Conclusion

The presented Koopman-based data-driven reachability framework (2604.00150) integrates system identification and reachability analysis for nonlinear, possibly non-affine, Lipschitz systems with unknown dynamics. By combining a state-input-dependent lifting, over-approximated residual sets, and zonotopic propagation, the approach yields substantially tighter reachable sets, particularly over long horizons and under noise, without sacrificing safety guarantees. These results provide a foundation for further work on scalable, formally verified data-driven robotics and control.