Data-Driven Reachability Analysis with Optimal Input Design

Published 6 Apr 2026 in eess.SY | (2604.04758v1)

Abstract: This paper addresses the conservatism in data-driven reachability analysis for discrete-time linear systems subject to bounded process noise, where the system matrices are unknown and only input–state trajectory data are available. Building on the constrained matrix zonotope (CMZ) framework, two complementary strategies are proposed to reduce conservatism in reachable-set over-approximations. First, the standard Moore–Penrose pseudoinverse is replaced with a row-norm-minimizing right inverse computed via a second-order cone program (SOCP), which directly reduces the size of the resulting model set, yielding tighter generators and less conservative reachable sets. Second, an online A-optimal input design strategy is introduced to improve the informativeness of the collected data and the conditioning of the resulting model set, thereby reducing uncertainty. The proposed framework extends naturally to piecewise affine systems through mode-dependent data partitioning. Numerical results on a five-dimensional stable LTI system and a two-dimensional piecewise affine system demonstrate that combining designed inputs with the row-norm right inverse significantly reduces conservatism compared to a baseline using random inputs and the pseudoinverse, leading to tighter reachable sets for safety verification.


Authors (4)

Summary

The paper presents a novel framework that reduces conservatism in reachable set over-approximations using right-inverse optimization and active input design.
It employs an SOCP-based right-inverse to minimize the generator norm, achieving up to a 73% reduction in over-approximation volume in a five-dimensional LTI system.
The approach extends seamlessly to piecewise affine systems, ensuring sound hybrid zonotope propagation and robust safety verification under bounded noise.


Introduction and Problem Setting

The paper "Data-Driven Reachability Analysis with Optimal Input Design" (2604.04758) addresses the challenge of constructing accurate over-approximations of reachable sets for discrete-time linear systems with unknown dynamics, where only noisy input–state trajectory data are accessible. Traditional analytical reachability methods relying on precise models are unsuitable in many real-world scenarios due to unmodeled dynamics, limited data, or complexity. Prior data-driven frameworks, notably matrix zonotopes (MZ), provide set-membership guarantees but often yield conservative reachable sets, primarily due to suboptimal use of the collected data and limitations of the employed algebraic machinery.

The proposed approach systematically reduces conservatism through two primary mechanisms: (1) right-inverse optimization for the regression matrix, yielding smaller generator norms, and (2) A-optimal online input design to improve data informativeness. The paper extends these ideas to piecewise affine (PWA) systems via per-mode data partitioning and hybrid zonotope propagation.

Set-Theoretic Framework and Model Set Construction

The reachability analysis is built on the model-consistent set-valued formalism, where system matrices consistent with observed data and bounded process noise are embedded in a constrained matrix zonotope (CMZ). For discrete-time systems,

$x(k+1) = A_{\text{tr}} x(k) + B_{\text{tr}} u(k) + w(k),$

with $w(k) \in \mathcal{W}$ unknown and only input–state trajectories available, the data equation

$X_+ = [A_{\text{tr}}\;\; B_{\text{tr}}] \Phi + W_-$

allows the construction of a model set as a matrix zonotope, further tightened to a CMZ by imposing kernel-consistency constraints leveraging the regressor's nullspace.

The crucial step here is forming

$\mathcal{M}_\Sigma(H) = \mathcal{N}_0 H,$

where $\mathcal{N}_0$ is the denoised data set satisfying kernel constraints, and $H$ is any right-inverse of $\Phi$, that is, any matrix with $\Phi H = I$. The structure and size of the resulting reachable set over-approximations are determined by the choice of $H$ and by how tightly the CMZ constraints encapsulate the feasible set.
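The role of the right-inverse can be checked numerically on a toy problem. The sketch below is an illustration only, not the paper's code: it assumes a scalar system (one state, one input) with hypothetical parameters `a_true`, `b_true`, and noise bound `w_max`, takes $\Phi$ to be the state samples stacked over the input samples, and uses the pseudoinverse as $H$. It shows that applying $H$ to the denoised data recovers the true parameters exactly, while applying it to the raw data gives the center of a model set whose radius is controlled by the entries of $H$.

```python
import random

def simulate(a, b, x0, inputs, noise):
    """Roll out x(k+1) = a*x(k) + b*u(k) + w(k) for a scalar system."""
    xs = [x0]
    for u, w in zip(inputs, noise):
        xs.append(a * xs[-1] + b * u + w)
    return xs

def pinv_right_inverse(phi):
    """Return H = Phi^T (Phi Phi^T)^{-1} for a 2 x T regressor (two rows)."""
    r0, r1 = phi
    g00 = sum(v * v for v in r0)
    g01 = sum(p * q for p, q in zip(r0, r1))
    g11 = sum(v * v for v in r1)
    det = g00 * g11 - g01 * g01
    if abs(det) < 1e-12:
        raise ValueError("Phi does not have full row rank")
    # Inverse of the symmetric 2 x 2 Gram matrix Phi Phi^T.
    i00, i01, i11 = g11 / det, -g01 / det, g00 / det
    # H is T x 2; row j equals [x_j, u_j] times the Gram inverse.
    return [[p * i00 + q * i01, p * i01 + q * i11] for p, q in zip(r0, r1)]

def row_times_matrix(row, h):
    """Multiply a 1 x T row vector by a T x 2 matrix."""
    return [sum(r * hj[c] for r, hj in zip(row, h)) for c in range(2)]

# Hypothetical toy data (all values are assumptions for illustration).
random.seed(0)
a_true, b_true, w_max, T = 0.8, 0.5, 0.05, 20
inputs = [random.uniform(-1, 1) for _ in range(T)]
noise = [random.uniform(-w_max, w_max) for _ in range(T)]
xs = simulate(a_true, b_true, 1.0, inputs, noise)

x_minus, x_plus = xs[:-1], xs[1:]
phi = [x_minus, inputs]                 # Phi = [X_-; U_-], size 2 x T
H = pinv_right_inverse(phi)

center = row_times_matrix(x_plus, H)    # X_+ H: center of the model set
denoised = [xp - w for xp, w in zip(x_plus, noise)]  # X_+ - W_- (not observable in practice)
recovered = row_times_matrix(denoised, H)            # equals [a_true, b_true]
```

The gap between `center` and `recovered` is $W_- H$, so each entry is bounded by `w_max` times the corresponding column sum of $|H|$. This is why a "smaller" $H$ gives a smaller model set.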

Row-Norm Right-Inverse Optimization

The choice of the right-inverse matrix $H$ is pivotal because it linearly maps the generators of the disturbance set into the model set, and therefore determines how conservative the reachable over-approximation is. Traditionally, the Moore–Penrose pseudoinverse is used, since it is the right-inverse of minimum Frobenius norm. However, the sum of row norms of $H$ provides a sharper upper bound for the contribution of process noise, directly influencing the over-approximation's size.
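A short calculation suggests why the sum of row norms is the natural proxy. The sketch assumes the common construction in which the noise zonotope is $\mathcal{W} = \langle 0, [g_1, \dots, g_q] \rangle$ and the stacked noise matrix zonotope over $T$ samples has generators $g_i e_j^\top$, with $e_j$ the $j$-th unit vector. The symbols $g_i$, $e_j$, $q$, $T$, and the row notation $H_{j,:}$ are introduced here for illustration and are not taken from the paper.

```latex
\begin{align*}
\bigl\| (g_i e_j^\top) H \bigr\|_F
  &= \bigl\| g_i \, H_{j,:} \bigr\|_F
   = \| g_i \|_2 \, \| H_{j,:} \|_2 , \\
\sum_{i=1}^{q} \sum_{j=1}^{T} \bigl\| (g_i e_j^\top) H \bigr\|_F
  &= \Bigl( \sum_{i=1}^{q} \| g_i \|_2 \Bigr)
     \Bigl( \sum_{j=1}^{T} \| H_{j,:} \|_2 \Bigr), \\
\| H \|_F
  = \Bigl( \sum_{j=1}^{T} \| H_{j,:} \|_2^2 \Bigr)^{1/2}
  &\le \sum_{j=1}^{T} \| H_{j,:} \|_2 .
\end{align*}
```

Under these assumptions, the total generator norm factors into a term fixed by the noise bound and the sum of row norms of $H$. The Frobenius norm is only a lower bound on that sum, so the right-inverse that minimizes it need not minimize the quantity that actually scales the generators.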

Thus, the paper proposes solving the following SOCP:

$\min_{H} \; \sum_{i} \| H_{i,:} \|_2 \quad \text{s.t.} \quad \Phi H = I,$

where $H_{i,:}$ denotes the $i$-th row of $H$.

This approach yields a right-inverse that minimizes the generator-norm proxy of the matrix zonotope, resulting in uniformly tighter reachable sets compared to the pseudoinverse, as formalized in Theorem 3 of the paper.
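The difference between the two right-inverses shows up in a case small enough to solve by hand. The sketch below does not call an SOCP solver. It assumes a regressor with a single row, so every row of $H$ is a scalar and the sum of row norms reduces to the sum of absolute values. In that special case the minimizer is known in closed form: all weight goes on the regressor entry of largest magnitude. The function names are hypothetical; a general regressor would need a conic solver.

```python
import math

def pinv_row(phi):
    """Pseudoinverse of a 1 x T row vector: Phi^T / (Phi Phi^T)."""
    s = sum(v * v for v in phi)
    return [v / s for v in phi]

def min_row_norm_inverse(phi):
    """Right-inverse of a 1 x T row vector with the smallest sum of |h_j|.

    Since 1 = sum(phi_j * h_j) <= max|phi_j| * sum|h_j|, the optimum puts
    all weight on the entry of largest magnitude.
    """
    j = max(range(len(phi)), key=lambda i: abs(phi[i]))
    h = [0.0] * len(phi)
    h[j] = 1.0 / phi[j]
    return h

def row_norm_sum(h):
    """Sum of row norms; rows are scalars here, so this is sum of |h_j|."""
    return sum(abs(v) for v in h)

def frobenius(h):
    """Frobenius norm of the T x 1 matrix h."""
    return math.sqrt(sum(v * v for v in h))

phi = [1.0, 1.0, 2.0]
h_pinv = pinv_row(phi)              # [1/6, 1/6, 1/3]
h_row = min_row_norm_inverse(phi)   # [0, 0, 1/2]

print(row_norm_sum(h_pinv), row_norm_sum(h_row))  # about 0.667 versus 0.5
print(frobenius(h_pinv), frobenius(h_row))        # about 0.408 versus 0.5
```

Both vectors satisfy $\Phi H = 1$. The pseudoinverse wins on Frobenius norm, as expected, but the second right-inverse has the smaller sum of row norms, which is the quantity the SOCP targets.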

Online A-Optimal Input Design

To further decrease conservatism, the informativeness of the collected data is improved using an online A-optimal input design scheme. At each step, the input is chosen greedily to maximize the reduction in the trace of the inverse information matrix, where the information matrix is built from all regressor vectors collected up to that step. This keeps the regressor $\Phi$ as well-conditioned as possible, which in turn reduces the reachable set's over-approximation. Input selection is performed over a constrained zonotope input set by a combination of candidate sampling and local SQP refinement.
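A greedy A-optimal step can be sketched for a scalar system, where the information matrix is 2 x 2 and the trace of its inverse has a closed form. This is a simplified illustration under stated assumptions, not the paper's algorithm: inputs are picked from a fixed candidate grid (no SQP refinement, no constrained-zonotope input set), the information matrix starts from a small regularizer `eps` so that it is invertible, and the plant parameters `a` and `b` are used only to simulate the measured next state. All function and variable names are hypothetical.

```python
def trace_inv(m):
    """Trace of the inverse of a symmetric 2 x 2 matrix given as (m00, m01, m11)."""
    m00, m01, m11 = m
    det = m00 * m11 - m01 * m01
    return (m00 + m11) / det

def add_regressor(m, x, u):
    """Rank-one update M + phi phi^T with regressor phi = (x, u)."""
    m00, m01, m11 = m
    return (m00 + x * x, m01 + x * u, m11 + u * u)

def choose_input(m, x, candidates):
    """Greedy A-optimal choice: the candidate giving the smallest tr(M^{-1})."""
    return min(candidates, key=lambda u: trace_inv(add_regressor(m, x, u)))

def design_inputs(a, b, x0, steps, candidates, eps=1e-3):
    """Collect inputs online; the plant (a, b) only stands in for measurements."""
    m = (eps, 0.0, eps)          # regularized initial information matrix
    x, inputs, costs = x0, [], []
    for _ in range(steps):
        u = choose_input(m, x, candidates)
        m = add_regressor(m, x, u)
        inputs.append(u)
        costs.append(trace_inv(m))
        x = a * x + b * u        # noise-free rollout, for illustration only
    return inputs, costs

candidates = [-1.0, -0.5, 0.0, 0.5, 1.0]
designed, costs = design_inputs(0.8, 0.5, 1.0, 10, candidates)
```

Because each update adds a positive semidefinite term, `costs` is non-increasing regardless of the input chosen. The design only affects how fast it falls.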

Theoretical results (Theorem 4) show that A-optimal input sequences strictly decrease the norm of the pseudoinverse and, thus, the generator-norm proxy of the model set for any feasible right-inverse. The effect compounds with the row-norm right-inverse optimization, tightening the over-approximations further than either strategy alone.

Reachable Set Propagation and Extensions

Sound reachable-set propagation using the constructed model sets preserves inclusion guarantees, thanks to the monotonicity of the propagation operator with respect to set inclusion. When extended to piecewise affine systems, the approach associates separate CMZ models with each mode, utilizes guard splitting to appropriately propagate hybrid trajectories, and employs hybrid zonotopes to efficiently represent reachable sets across mixed logical dynamical branches.
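The monotonicity argument can be illustrated with a deliberately simpler set representation. The sketch below swaps the paper's zonotope and CMZ machinery for plain intervals on a scalar system. Interval arithmetic is also inclusion-monotone, so it shows the same two properties: a tighter model set never produces a larger reachable set, and the true trajectory stays inside. All names and numerical values are assumptions for illustration.

```python
def interval_mul(p, q):
    """Product of two intervals given as (low, high) pairs."""
    products = [p[0] * q[0], p[0] * q[1], p[1] * q[0], p[1] * q[1]]
    return (min(products), max(products))

def interval_add(*terms):
    """Sum of any number of intervals."""
    return (sum(t[0] for t in terms), sum(t[1] for t in terms))

def step(x_set, a_set, b_set, u_set, w_set):
    """One-step over-approximation of a*x + b*u + w over all set members."""
    return interval_add(interval_mul(a_set, x_set),
                        interval_mul(b_set, u_set),
                        w_set)

def reach(x0, a_set, b_set, u_set, w_set, steps):
    """Reachable intervals for steps 0..steps."""
    sets = [x0]
    for _ in range(steps):
        sets.append(step(sets[-1], a_set, b_set, u_set, w_set))
    return sets

x0, u_set, w_set = (0.9, 1.1), (-1.0, 1.0), (-0.05, 0.05)
# Two hypothetical model sets, both containing the true (a, b) = (0.8, 0.5).
loose = reach(x0, (0.70, 0.90), (0.40, 0.60), u_set, w_set, 5)
tight = reach(x0, (0.78, 0.82), (0.48, 0.52), u_set, w_set, 5)
```

Every interval in `tight` is contained in the corresponding interval in `loose`, mirroring how a smaller model set translates into a smaller reachable set while inclusion of the true behavior is preserved.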

Numerical Results and Empirical Validation

The proposed methods are evaluated in numerical experiments on a five-dimensional LTI system and a two-dimensional PWA system.

The reachable-set projections for the LTI system show that all designed-input variants are noticeably tighter than the corresponding random-input baselines. Notably, the combination of CMZ constraints and the SOCP right-inverse leads to the smallest set over-approximations (cyan contours), while both random and designed inputs preserve soundness.

Figure 1: Reachable-set comparison on the five-dimensional LTI system projected onto three distinct two-dimensional subspaces. All designed-input methods produce dramatically less conservative reachable sets than the random-input baselines. The tightest set is obtained by combining CMZ constraints with the SOCP right-inverse.

Quantitative volume computations at the final step indicate that using the SOCP right-inverse reduces the over-approximation volume by up to 73% relative to the random-input, pseudoinverse baseline. Input design alone also reduces the volume, and the two methods are complementary.
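For readers who want to reproduce this kind of comparison, a volume-reduction figure can be computed directly from zonotope generators. The sketch below handles the two-dimensional case, where the area of a zonotope with generators $g_1, \dots, g_p$ and factors in $[-1, 1]$ is $4 \sum_{i<j} |\det[g_i \; g_j]|$. In $n$ dimensions the same formula uses $2^n$ and all $n \times n$ generator submatrices. The generator values are made up for illustration and are unrelated to the paper's experiments.

```python
from itertools import combinations

def zonotope_area(generators):
    """Area of a 2-D zonotope {c + sum_i beta_i g_i : |beta_i| <= 1}."""
    return 4.0 * sum(abs(g[0] * h[1] - g[1] * h[0])
                     for g, h in combinations(generators, 2))

def percent_reduction(baseline, improved):
    """Relative volume reduction of `improved` with respect to `baseline`."""
    return 100.0 * (1.0 - improved / baseline)

# Hypothetical generator sets for a baseline and a tightened reachable set.
baseline_gens = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
tight_gens = [(0.5, 0.0), (0.0, 0.5), (0.5, 0.5)]

baseline_area = zonotope_area(baseline_gens)   # 12.0
tight_area = zonotope_area(tight_gens)         # 3.0
print(percent_reduction(baseline_area, tight_area))  # 75.0
```

Halving every generator in two dimensions shrinks the area by a factor of four, which is why modest reductions in generator norm can produce large reductions in volume, and more so in higher dimensions.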

For the PWA system, the method is extended per mode. Visual comparison confirms that both random- and designed-input data-driven over-approximations contain the model-based ground truth, and the designed-input set is substantially smaller.

Figure 2: PWA system reachable set comparison over 10 steps. The designed-input variant yields a tighter over-approximation than the random-input baseline, while both soundly contain the model-based set.

Theoretical and Practical Implications

The paper establishes that integrating right-inverse optimization with active input design provides a generic, theoretically guaranteed, and computationally tractable approach for reducing conservatism in data-driven reachability. The results are directly transferable to safety verification and control synthesis for unknown or partially known systems under bounded uncertainty, provided only access to input–state data and constraints. The approach naturally extends to hybrid and piecewise affine systems, requiring only per-mode application of the set-based machinery.

Practically, the proposed input design integrates seamlessly into routine data collection protocols, and the SOCP for right-inverse optimization scales linearly in the number of timesteps, making the method suitable for high-dimensional or fast-sampling applications.

Future Directions

Important avenues for future work include mitigating the combinatorial branching in hybrid dynamics, extending the input design horizon (e.g., multi-step or receding horizon strategies), and exploiting the algebraic structure in the matrix-zonotope and constrained-zonotope product to gain tighter approximations and improved scalability.

Conclusion

The presented framework leverages model-consistent set descriptions, right-inverse optimization, and A-optimal input design to obtain less conservative, sound over-approximations of reachable sets from bounded-noise input–state data. Both the theoretical and the empirical analyses indicate that combining the two strategies, right-inverse optimization and informative experiment design, yields markedly tighter certified outer approximations than the random-input, pseudoinverse baseline, without sacrificing tractability or soundness. The approach extends directly to piecewise affine systems and is broadly applicable to data-driven verification of cyber-physical systems (CPS) and to robust control.