Resilient Observer Design

Updated 9 November 2025

A resilient observer is an estimation architecture that uses redundancy to maintain accurate state estimates despite sensor, communication, and Byzantine faults.
It employs a dual-stage method combining mode separation via coordinate transform and local filtering-based resilient estimation for robust performance.
Convergence guarantees ensure that non-compromised nodes reliably recover the true state, leveraging strong-robustness and redundancy in networked systems.

A resilient observer is an estimation architecture or algorithm designed to maintain accurate state estimation in the presence of adversarial disruptions, including sensor/actuator attacks, communication failures, network-induced faults, and arbitrary malicious behaviors by system nodes. Such observers form a central component in resilient control and secure state estimation for cyber-physical systems, enabling correct operation even under the most severe threat models, such as Byzantine adversaries. Rigorous analysis of resilient observers requires the precise formulation of fault or attack models, robust estimator design anchored in system and network redundancy, and quantification of theoretical performance guarantees under worst-case conditions.

1. Threat Model and Problem Setting

In the canonical setting, an LTI (Linear Time-Invariant) system is monitored by a network of $N$ agents (sensor/estimator nodes), connected via a directed communication graph. The system evolves as

$x[k+1] = A\,x[k], \quad y_i[k] = C_i x[k], \quad i=1,\ldots,N,$

with $(A,C)$ detectable, where $C$ stacks the individual $C_i$, although the pair $(A,C_i)$ of a single agent $i$ need not be detectable. The principal threat is a set $\mathcal{A}\subset\{1,...,N\}$ of adversarial (“Byzantine”) nodes that have complete knowledge of the system, the network, and all estimation protocols in use, and that may behave arbitrarily, including sending inconsistent data to different neighbors and colluding with one another (Mitra et al., 2018). Under the $f$-local adversary model used below, each regular node has at most $f$ adversarial in-neighbors.

The fundamental challenge lies in designing a distributed observer that can guarantee correct state estimation by all non-compromised nodes despite arbitrary, possibly dynamic, coalition attacks. A “resilient observer” in this context is a finite-memory, causal, possibly randomized algorithm that provably recovers $x[k]$ at each regular node, subject to carefully quantified limitations from system structure and adversary distribution.

2. Fundamental Limitations and Necessary Redundancy

Resiliency is not a property that can be achieved for all LTI systems under arbitrary network and sensing arrangements. The impossibility results in (Mitra et al., 2018) show:

Critical Sets: Any subset $F$ of nodes whose removal renders $(A,C_{V\setminus F})$ undetectable must be robustly “covered” in the topology.
Measurement Redundancy: For each unstable eigenvalue $\lambda$ of $A$, at least $2f+1$ nodes in the network must have measurements that directly detect $\lambda$.
Communication Redundancy: If node $i$ cannot detect the plant from its own measurements, and some set of up to $2f$ of its in-neighbors can be deleted so that no measurement path detecting the plant remains, then no synchronous, deterministic algorithm can guarantee estimation at $i$.

This makes double redundancy in both measurements and communication necessary under $f$-local adversary models. In particular, resilient distributed state estimation is possible only if, for each node that cannot detect the plant on its own, no set of up to $2f$ of its in-neighbors “cuts” the network’s detectability.
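
The per-mode measurement requirement amounts to a counting check. The sketch below assumes a hypothetical input `detects` that lists, for each node, the unstable modes it can detect from its own measurements (how that list is obtained, for example by a detectability test on $(A,C_i)$, is outside the sketch). The function names and mode labels are illustrative and do not come from the source.

```python
def mode_sources(unstable_modes, detects):
    """Map each unstable mode to the set of nodes that detect it directly.

    detects: dict node -> set of mode labels (hypothetical input format).
    """
    return {
        mode: {node for node, modes in detects.items() if mode in modes}
        for mode in unstable_modes
    }


def meets_measurement_redundancy(unstable_modes, detects, f):
    """True if every unstable mode has at least 2f+1 directly detecting nodes."""
    sources = mode_sources(unstable_modes, detects)
    return all(len(nodes) >= 2 * f + 1 for nodes in sources.values())


# Illustrative data: two unstable modes, five sensor nodes.
modes = ["lam1", "lam2"]
detects = {
    0: {"lam1"},
    1: {"lam1"},
    2: {"lam1", "lam2"},
    3: {"lam2"},
    4: {"lam2"},
}
print(meets_measurement_redundancy(modes, detects, f=1))  # each mode has 3 sources
print(meets_measurement_redundancy(modes, detects, f=2))  # 5 sources would be needed
```

Passing this check is only one of the necessary conditions listed above; it says nothing about whether the communication graph carries the information to the remaining nodes.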

3. Strong-Robustness Property and r-Feasibility

The fundamental graph-theoretic notion enabling resilient observer design is strong- $r$ -robustness:

Given a set of “source” nodes $S$ (capable of autonomously detecting a given unstable mode), the graph $G = (V, E)$ is strongly $r$ -robust w.r.t. $S$ if every nonempty $X\subset V\setminus S$ contains a node with at least $r$ in-neighbors outside $X$ .

If $G$ is strongly $r$-robust w.r.t. the source set for each unstable eigenmode, and the system is detectable, the triple $(A,C,G)$ is called $r$-feasible. This condition quantifies the distributed measurement and communication redundancy needed to guarantee resilience against up to $f = \lfloor(r-1)/3\rfloor$ adversarial in-neighbors per regular node.
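
For small graphs the definition can be tested directly by enumerating every nonempty subset $X$ of the non-source nodes. The following is a brute-force sketch intended only to make the definition concrete (it is exponential in the number of non-source nodes). The graph encoding, a dict from each node to its set of in-neighbors, and the function names are assumptions made for this example.

```python
from itertools import combinations


def is_strongly_r_robust_bruteforce(in_neighbors, sources, r):
    """Check the definition literally: every nonempty X within V minus S must
    contain a node with at least r in-neighbors outside X."""
    sources = set(sources)
    others = [v for v in in_neighbors if v not in sources]
    for size in range(1, len(others) + 1):
        for subset in combinations(others, size):
            x = set(subset)
            if not any(len(in_neighbors[v] - x) >= r for v in x):
                return False
    return True


def tolerated_f(r):
    """Number of adversarial in-neighbors tolerated, f = floor((r - 1) / 3)."""
    return (r - 1) // 3


# Illustrative graph: nodes 0..3 are sources, nodes 4 and 5 rely on neighbors.
graph = {0: set(), 1: set(), 2: set(), 3: set(),
         4: {0, 1, 2, 3}, 5: {1, 2, 3, 4}}
print(is_strongly_r_robust_bruteforce(graph, {0, 1, 2, 3}, r=4))
print(tolerated_f(4))
```

In this example the graph is strongly 4-robust but not strongly 5-robust, so the formula above gives $f = 1$.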

4. Byzantine-Resilient Distributed Observer Architecture

The resilient observer architecture in (Mitra et al., 2018) involves two interlocking estimation procedures at each regular node:

Mode Separation via Coordinate Transform: The global dynamics are transformed into real Jordan canonical form. Each agent $i$ identifies the set of unstable modes it can detect and runs a standard Luenberger observer on those.

$\hat z_{\mathcal{O}_i}[k+1] = M_{\mathcal{O}_i}\,\hat z_{\mathcal{O}_i}[k] + L_i\,\left(y_i[k] - C_{\mathcal{O}_i}\hat z_{\mathcal{O}_i}[k]\right)$
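
As a minimal illustration of this update, the sketch below runs the observer on a single scalar unstable mode. The values `m = 2.0`, `c = 1.0`, and `gain = 1.5` are assumptions chosen so that the error contracts by the factor `m - gain * c = 0.5` per step; they do not come from the source.

```python
def luenberger_step(z_hat, y, m, c, gain):
    """One observer update: z_hat[k+1] = m*z_hat[k] + gain*(y[k] - c*z_hat[k])."""
    return m * z_hat + gain * (y - c * z_hat)


def simulate(m=2.0, c=1.0, gain=1.5, z0=1.0, z_hat0=0.0, steps=20):
    """Return the absolute estimation error after each step."""
    z, z_hat = z0, z_hat0
    errors = []
    for _ in range(steps):
        y = c * z                                  # noiseless measurement
        z_hat = luenberger_step(z_hat, y, m, c, gain)
        z = m * z                                  # plant mode evolves
        errors.append(abs(z - z_hat))
    return errors


errors = simulate()
print(errors[0], errors[-1])  # the error shrinks even though z itself grows
```

The error obeys $e[k+1] = (m - \text{gain}\cdot c)\,e[k]$, so any gain with $|m - \text{gain}\cdot c| < 1$ gives convergence for a detectable scalar mode.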

Local Filtering-Based Resilient Estimation (LFRE):
- For each unstable mode $\lambda_j\notin \mathcal{O}_i$ , node $i$ finds a subset $N_i^{(j)}$ of in-neighbors with reliable communication and access to estimates of $z^{(j)}$ .
- At each time:
  - Node $i$ collects all neighbor estimates for $z^{(j)}[k]$ .
  - For each scalar subcomponent, the $f$ largest and $f$ smallest entries are discarded, leaving $\geq r-2f$ survivors.
  - Any convex combination of survivors is used as the updated estimate; the convexity ensures the estimate remains within the “safe” interval defined by honest agents.
  - The state block is updated via the real-Jordan map: $\hat z_i^{(j)}[k+1] = W(\lambda_j)\,\bar z_i^{(j)}[k]$ .

Finally, the agent reconstructs the full state estimate $\hat x_i[k]=T\hat z_i[k]$ .
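
The filtering step can be sketched for a scalar real mode, where the map $W(\lambda_j)$ reduces to multiplication by $\lambda_j$. The equal-weight average used here is just one admissible convex combination, and the function names are illustrative assumptions rather than the source's notation.

```python
def trimmed_estimate(values, f):
    """Drop the f largest and f smallest values, then average the survivors.

    The plain average is one valid convex combination of the survivors.
    """
    if len(values) < 2 * f + 1:
        raise ValueError("need at least 2f+1 neighbor values to filter")
    survivors = sorted(values)[f:len(values) - f]
    return sum(survivors) / len(survivors)


def lfre_update(neighbor_estimates, f, lam):
    """Scalar-mode update: z_hat[k+1] = lam * (filtered neighbor estimate)."""
    return lam * trimmed_estimate(neighbor_estimates, f)


# Three honest neighbors near 1.0 and one adversarial outlier.
received = [1.0, 1.1, 0.9, 100.0]
print(trimmed_estimate(received, f=1))  # the outlier is discarded
print(lfre_update(received, f=1, lam=2.0))
```

An adversarial value that survives trimming must lie between two honest values, which is why the filtered estimate stays inside the honest range.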

The key insight is that this algorithm does not require global consensus, majority broadcasts, or heavy computation; rather, resilience emerges from redundancy in filtering, aggressive elimination, and convex combination.

5. Convergence Guarantees and Correctness

If the network is strongly $(3f+1)$-robust w.r.t. the source set for each unstable mode, then for any $f$-local adversary, the estimates of all regular nodes converge asymptotically to the true state:

$\|\hat x_i[k] - x[k]\| \to 0 \quad \text{as } k\to\infty, \quad \forall\, i \notin \mathcal{A}.$

The proof proceeds by induction over an acyclic layering of the graph induced by the Mode Estimation Directed Acyclic Graph (MEDAG) for each mode, tracking the impact of filtering and the survivability of honest estimates through the convex hull argument. Because the values that survive filtering at each step are confined to the range of the honest estimates, and the source nodes' estimates of each mode converge, all errors are ultimately driven to zero as they propagate through the layers.
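
For a scalar real mode $\lambda_j$, one induction step can be written out as follows. This is a sketch under assumptions: the error symbol $e_i^{(j)}[k] = \hat z_i^{(j)}[k] - z^{(j)}[k]$ and the weights $w_{il}[k]$ are notation introduced here, not taken from the source. Since the filtered value $\bar z_i^{(j)}[k]$ lies in the interval spanned by the honest neighbors' estimates, it can be expressed as a convex combination of them:

$\bar z_i^{(j)}[k] = \sum_{l \in N_i^{(j)}\setminus\mathcal{A}} w_{il}[k]\,\hat z_l^{(j)}[k], \quad w_{il}[k]\ge 0, \quad \sum_{l} w_{il}[k] = 1.$

Combining the mode dynamics $z^{(j)}[k+1] = \lambda_j z^{(j)}[k]$ with the update $\hat z_i^{(j)}[k+1] = \lambda_j\,\bar z_i^{(j)}[k]$ gives

$e_i^{(j)}[k+1] = \lambda_j \sum_{l \in N_i^{(j)}\setminus\mathcal{A}} w_{il}[k]\,e_l^{(j)}[k], \qquad |e_i^{(j)}[k+1]| \le |\lambda_j| \max_{l \in N_i^{(j)}\setminus\mathcal{A}} |e_l^{(j)}[k]|.$

If the errors of the honest neighbors in earlier layers tend to zero (the induction hypothesis), the right-hand side tends to zero, and so does the error at node $i$, even though $|\lambda_j| \ge 1$.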

6. Computational Complexity and Topology Verification

The strong $r$-robustness property for a given source set $S$ can be checked in polynomial time by analogy to the threshold-$r$ bootstrap percolation problem: starting from $S$, nodes are added iteratively if they have at least $r$ in-neighbors in the growing set, and the graph passes the check when every node is eventually added. The process terminates in at most $N$ rounds, with each round requiring $O(|E|)$ operations. For a system with $n$ unstable modes, total verification complexity is $O(n N |E|)$.
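
A minimal sketch of this percolation check follows. The graph encoding (a dict from each node to its set of in-neighbors) and the function name are assumptions made for the example; the check would be repeated once per unstable mode with that mode's source set.

```python
def is_strongly_r_robust(in_neighbors, sources, r):
    """Threshold-r bootstrap percolation from the source set.

    A node joins the reached set once at least r of its in-neighbors are
    already in it; the graph passes if every node is eventually reached.
    """
    reached = set(sources)
    changed = True
    while changed:                      # at most N rounds add a node
        changed = False
        for node, nbrs in in_neighbors.items():
            if node not in reached and len(nbrs & reached) >= r:
                reached.add(node)
                changed = True
    return reached == set(in_neighbors)


# Illustrative graph: nodes 0..3 are sources for the mode under test.
graph = {0: set(), 1: set(), 2: set(), 3: set(),
         4: {0, 1, 2, 3}, 5: {1, 2, 3, 4}}
print(is_strongly_r_robust(graph, {0, 1, 2, 3}, r=4))
print(is_strongly_r_robust(graph, {0, 1, 2, 3}, r=5))
```

The check is equivalent to the subset definition: if percolation stalls, the unreached nodes form a set $X$ in which no node has $r$ in-neighbors outside $X$.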

The construction thus allows network designers to a priori verify whether a given topology suffices for resilience and facilitates scalable synthesis for large networks.

7. Scaling Behavior and Applicability

Resilient observers designed with strong-robustness scale naturally and are applicable to a wide class of network models:

Preferential-Attachment Networks (Barabási–Albert): New nodes attach to $\geq r$ existing nodes, preserving $r$ -feasibility inductively.
Erdős–Rényi Random Graphs: With $p \gg (\ln N)/N$ and source set size $\geq c\ln N$ , strong $r$ -robustness holds for $r \approx c\ln N$ with high probability.
Random Geometric Graphs: Connectivity and robust-percolation thresholds ensure that $r$-feasibility is attainable for $f \lesssim (r-1)/3$.
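
The inductive argument for preferential attachment can be illustrated with a small growth experiment. This is a sketch under assumptions not stated in the source: edges point from the chosen existing nodes to the new node, the initial nodes form the source set for a single mode, and attachment probability is proportional to a simple degree counter. All names are hypothetical.

```python
import random


def grow_network(num_sources, num_new, r, seed=0):
    """Grow a directed graph in which each new node listens to r existing nodes."""
    if num_sources < r:
        raise ValueError("need at least r source nodes to attach to")
    rng = random.Random(seed)
    in_neighbors = {v: set() for v in range(num_sources)}
    degree = {v: 1 for v in range(num_sources)}    # start at 1 so all can be picked
    for v in range(num_sources, num_sources + num_new):
        nodes = list(degree)
        weights = [degree[u] for u in nodes]
        chosen = set()
        while len(chosen) < r:                     # draw until r distinct nodes
            chosen.add(rng.choices(nodes, weights=weights)[0])
        in_neighbors[v] = chosen
        for u in chosen:
            degree[u] += 1
        degree[v] = 1
    return in_neighbors


def percolates(in_neighbors, sources, r):
    """Threshold-r bootstrap percolation check from the source set."""
    reached = set(sources)
    changed = True
    while changed:
        changed = False
        for node, nbrs in in_neighbors.items():
            if node not in reached and len(nbrs & reached) >= r:
                reached.add(node)
                changed = True
    return reached == set(in_neighbors)


graph = grow_network(num_sources=4, num_new=50, r=4)
print(percolates(graph, sources=range(4), r=4))
```

Each new node has $r$ in-neighbors among nodes that were already reached, so the percolation check succeeds for any seed in this construction.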

Simulations, even in simple scalar divergent systems ( $x[k+1]=2 x[k]$ ), demonstrate that standard consensus observers fail under even a single constant-bias attack, whereas the LFRE with $(3f+1)$ -filtering yields exact tracking for all honest nodes.
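
A toy version of such an experiment is sketched below for $x[k+1]=2x[k]$ with $f=1$. The topology, bias value, and observer gain are assumptions chosen for illustration, and the untrimmed average serves only as a simple stand-in for a non-resilient scheme, not as the consensus observer evaluated in the source.

```python
LAM = 2.0                  # plant: x[k+1] = 2 x[k]
F = 1                      # at most one Byzantine in-neighbor per regular node
SOURCES = [0, 1, 2, 3]     # nodes that measure x directly
IN_NEIGHBORS = {4: [0, 1, 2, 3], 5: [1, 2, 3, 4]}   # non-source nodes
BYZANTINE = 3              # node 3 transmits a biased estimate
BIAS = 50.0
GAIN = 1.5                 # source observer gain; error factor LAM - GAIN = 0.5


def run(steps=20, trim=True):
    """Return the final absolute errors of the non-source regular nodes."""
    x = 1.0
    est = {i: 0.0 for i in range(6)}
    for _ in range(steps):
        sent = dict(est)
        sent[BYZANTINE] = est[BYZANTINE] + BIAS     # constant-bias attack
        new = {}
        for i in SOURCES:                           # scalar Luenberger update
            new[i] = LAM * est[i] + GAIN * (x - est[i])
        for i, nbrs in IN_NEIGHBORS.items():
            vals = sorted(sent[j] for j in nbrs)
            if trim:                                # discard F largest, F smallest
                vals = vals[F:len(vals) - F]
            new[i] = LAM * sum(vals) / len(vals)
        est = new
        x = LAM * x
    return {i: abs(x - est[i]) for i in IN_NEIGHBORS}


print("trimmed:  ", run(trim=True))
print("untrimmed:", run(trim=False))
```

With trimming, the errors at nodes 4 and 5 shrink toward zero while the state itself grows; without it, the injected bias persists in both estimates.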

8. Significance and Broader Context

Resilient observers, as formalized above, provide a rigorous, practically realizable methodology for maintaining distributed state estimation in adversarial and compromised network environments. Their central contribution is a precise analytic bridge between graph-theoretic structural properties (strong-robustness) and achievable estimation guarantees, under the most powerful threat models permitted by information theory. The algorithm avoids the need for centralization, explicit attack identification, or computationally expensive global optimization, relying instead on measurement redundancy and local message filtering.

This conceptual framework enables a host of further generalizations (e.g., weighted/moving-horizon observers, resilient fusion for nonlinear and switching systems, and extension to stochastic or event-triggered communication settings) and has directly influenced advances in secure control, distributed diagnosis, and multi-agent safety. It anchors the theoretical limit on how much adversarial corruption can be tolerated, showing the optimality of the “$2f+1$ redundancy per mode” limit for generic LTI networks with arbitrary Byzantine communication (Mitra et al., 2018).

References (1)

Mitra et al. (2018). Byzantine-Resilient Distributed Observers for LTI Systems.
