ARMA Graph Filtering in Signal Processing
- ARMA graph filtering is defined as rational functions of graph shift operators, unifying spectral and vertex-domain processing.
- It leverages distributed, recursion-based implementations and dynamic adaptations to efficiently filter signals on arbitrary graphs.
- Design methods use constrained optimization and iterative techniques to ensure stability, noise robustness, and superior spectral approximation.
Autoregressive Moving Average (ARMA) Graph Filtering is a class of graph signal processing methods that model graph filters as rational functions of a graph shift operator, extending classical ARMA filter theory from time series to signals supported on the vertices of arbitrary graphs. This framework unifies spectral and vertex-domain perspectives, enables robust distributed implementations, and has become fundamental in modern graph signal processing and geometric deep learning.
1. Fundamental Concepts and Formal Definitions
ARMA graph filters are defined by their operation on a graph signal $x \in \mathbb{R}^n$, where $n$ is the number of nodes, and a symmetric graph shift operator $S$ (e.g., the adjacency matrix or Laplacian). The graph Fourier transform (GFT) diagonalizes $S$, yielding $S = U \Lambda U^\top$, where $U$ contains the eigenvectors $u_1, \dots, u_n$, and $\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$ holds the eigenvalues (graph frequencies).
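As a concrete illustration, the eigendecomposition and GFT above can be sketched in a few lines of NumPy (the 4-node path graph and the Laplacian-as-shift choice are illustrative assumptions):

```python
import numpy as np

# Sketch: graph Fourier transform on a 4-node path graph, with the
# combinatorial Laplacian as the shift operator S = U Λ U^T.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A          # combinatorial Laplacian
lam, U = np.linalg.eigh(L)              # graph frequencies and eigenvectors

x = np.array([1.0, 2.0, 3.0, 4.0])      # a graph signal
x_hat = U.T @ x                         # forward GFT
x_rec = U @ x_hat                       # inverse GFT recovers the signal
```

Since $U$ is orthonormal, the inverse GFT reconstructs the signal exactly, and the smallest Laplacian eigenvalue is zero (the constant mode).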
A generic ARMA(M,N) graph filter has spectral response

$$h(\lambda) = \frac{\sum_{m=0}^{M} b_m \lambda^m}{1 + \sum_{n=1}^{N} a_n \lambda^n},$$

where $b_m$ and $a_n$ are the moving-average and autoregressive coefficients, respectively. Application of the filter in the graph domain is achieved via

$$y = h(S)\,x = \Big(I + \sum_{n=1}^{N} a_n S^n\Big)^{-1} \Big(\sum_{m=0}^{M} b_m S^m\Big) x$$

(Loukas et al., 2015, Bianchi et al., 2019, Liu et al., 2017).
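A minimal sketch of this vertex-domain application, assuming the rational-operator form above (the graph and the ARMA(1,1) coefficients `b`, `a` are illustrative, not taken from any cited design):

```python
import numpy as np

def arma_apply(S, x, b, a):
    """y = (I + sum_n a[n] S^(n+1))^(-1) (sum_m b[m] S^m) x."""
    n_nodes = S.shape[0]
    num = sum(bm * np.linalg.matrix_power(S, m) for m, bm in enumerate(b))
    den = np.eye(n_nodes) + sum(
        an * np.linalg.matrix_power(S, n + 1) for n, an in enumerate(a))
    return np.linalg.solve(den, num @ x)

A = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], float)
L = np.diag(A.sum(1)) - A
lam, U = np.linalg.eigh(L)

b, a = [1.0, 0.2], [0.1]                 # illustrative ARMA(1, 1)
x = np.array([1.0, -1.0, 0.5])
y = arma_apply(L, x, b, a)

# Consistency check against the spectral response h(λ):
h = (b[0] + b[1] * lam) / (1 + a[0] * lam)
y_spec = U @ (h * (U.T @ x))
```

The vertex-domain solve and the spectral evaluation agree, which is exactly the unification of the two perspectives the text describes.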
Vertex-domain (time-domain) implementations exist via recursions; for a first-order ARMA, $y_{t+1} = \psi S y_t + \varphi x$, whose steady state $y = \varphi (I - \psi S)^{-1} x$ yields the ARMA frequency response $h(\lambda) = \varphi / (1 - \psi \lambda)$ (Loukas et al., 2015, Isufi et al., 2016).
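The first-order recursion and its steady state can be checked numerically; the sketch below assumes the recursion $y_{t+1} = \psi S y_t + \varphi x$ with $S$ the Laplacian and illustrative coefficients chosen so that $|\psi|\,\rho(S) < 1$:

```python
import numpy as np

# First-order ARMA recursion; converges geometrically when
# |psi| * spectral_radius(S) < 1 (coefficients here are illustrative).
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], float)
L = np.diag(A.sum(1)) - A
rho = max(abs(np.linalg.eigvalsh(L)))

psi, phi = 0.3 / rho, 1.0               # |psi| * rho(L) = 0.3 < 1
x = np.array([1.0, 0.0, -1.0])

y = np.zeros_like(x)
for _ in range(200):
    y = psi * (L @ y) + phi * x         # one-hop exchange per iteration

# Closed-form steady state phi * (I - psi L)^(-1) x
y_closed = phi * np.linalg.solve(np.eye(3) - psi * L, x)
```

After a few hundred one-hop iterations the distributed state matches the closed-form solution to machine precision.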
2. Algorithmic Implementations and Distributed Realization
ARMA graph filters admit various vertex-domain implementations that support localized and distributed computation:
- Parallel ARMA: Run $K$ independent first-order recursions, each with different coefficients, and sum the outputs. At each node, local states are exchanged with neighbors, and memory and communication per node scale linearly with $K$ (Loukas et al., 2015, Isufi et al., 2016).
- Periodic ARMA: Use a single state vector updated with $K$-periodically varying coefficients, reducing storage requirements. Both approaches guarantee geometric convergence under well-defined stability conditions (Loukas et al., 2015).
- Graph Neural Networks (GNNs) with ARMA Layers: The ARMA filter is implemented as a stack of recursive graph convolutional-skip (GCS) blocks, blending non-linear propagations and skip (moving-average) connections. To approximate a Kth-order ARMA, multiple parallel stacks are run with subsequent averaging (Bianchi et al., 2019, Abburi et al., 17 Jan 2026). Every update only relies on one-hop neighbor aggregation, supporting scalability and transferability across graphs (Bianchi et al., 2019).
- State Space and Adaptive Realizations: The GRAMA model generalizes the ARMA structure by integrating it as a state-space model with dynamic, data-adaptive AR/MA weights generated by cross-step attention, allowing permutation-equivariant and long-range dependency modeling in sequential and static graphs (Eliasof et al., 22 Jan 2025).
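The parallel construction from the list above can be sketched directly: $K$ first-order recursions run side by side and their steady-state outputs are summed (the graph and coefficients below are illustrative, with each branch chosen stable):

```python
import numpy as np

def parallel_arma(S, x, psis, phis, iters=300):
    """K independent first-order branches, outputs summed at the end."""
    y_branches = [np.zeros_like(x) for _ in psis]
    for _ in range(iters):
        y_branches = [psi * (S @ y) + phi * x
                      for y, psi, phi in zip(y_branches, psis, phis)]
    return sum(y_branches)

A = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], float)
L = np.diag(A.sum(1)) - A
rho = max(abs(np.linalg.eigvalsh(L)))
psis = [0.2 / rho, -0.4 / rho]          # |psi_k| * rho(L) < 1 for each branch
phis = [1.0, 0.5]

x = np.array([2.0, -1.0, 0.0])
y = parallel_arma(L, x, psis, phis)

# Steady state equals the sum of each branch's closed form:
y_closed = sum(phi * np.linalg.solve(np.eye(3) - psi * L, x)
               for psi, phi in zip(psis, phis))
```

Each branch only needs its own scalar pair and one-hop exchanges, which is where the linear-in-$K$ per-node cost comes from.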
3. Filter Design, Identification, and Theoretical Properties
Designing ARMA graph filters involves the selection of coefficients to closely approximate a prescribed spectral response $h^*(\lambda)$:
- Universal/Graph-Independent Coefficient Design: Coefficients are typically chosen without knowledge of the exact spectrum, solving constrained rational approximation problems over the relevant spectral interval. Stability (poles outside the unit disc or away from the spectrum) is enforced via convex or iterative constraints. Shanks’ method and variants are standard (Loukas et al., 2015, Isufi et al., 2016, Liu et al., 2017).
- Prony-inspired and Iterative Methods: Prony-inspired procedures linearize the rational fitting problem, while iterative Steiglitz–McBride–style solvers minimize the true mean-square error between $h^*$ and the ARMA response. Projection and regularization are used to address numerical conditioning. Empirically, low-order ARMA fits attain 1–2 orders of magnitude lower error than comparable polynomial (FIR) designs (Liu et al., 2017).
- Chebyshev-SOCP Optimization: Using Chebyshev polynomial parameterization ensures numerical stability. Weighted least-squares (WLS) fitting is converted into an iterative second-order cone programming (SOCP) problem, guaranteeing stability via explicit linear constraints (Pakiyarajah et al., 2021).
| Design Method | Algorithmic Approach | Stability Enforcement |
|---|---|---|
| Shanks/Projection | Linearization & least squares | Spectrum mapping + constraints |
| Steiglitz–McBride | Iterative MSE minimization | Regularization, subspace search |
| Chebyshev-SOCP | Iterative WLS via SOCP | Linear margin constraints |
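The Prony-style linearization in the table admits a compact sketch: rather than fitting the nonconvex ratio directly, one solves a linear least-squares problem in the stacked coefficients (the target response, spectral grid, and orders below are illustrative assumptions):

```python
import numpy as np

def linearized_fit(lams, h_target, M, N):
    """Solve h*(λ)(1 + Σ a_n λ^n) ≈ Σ b_m λ^m in least squares."""
    Vb = np.vander(lams, M + 1, increasing=True)         # columns λ^0..λ^M
    Va = np.vander(lams, N + 1, increasing=True)[:, 1:]  # columns λ^1..λ^N
    # Unknowns [b_0..b_M, a_1..a_N]; each row reads b(λ) - h*(λ) a(λ) = h*(λ)
    Amat = np.hstack([Vb, -(h_target[:, None]) * Va])
    coef, *_ = np.linalg.lstsq(Amat, h_target, rcond=None)
    return coef[:M + 1], coef[M + 1:]

lams = np.linspace(0.0, 2.0, 50)
h_target = 1.0 / (1.0 + lams)           # rational target: exactly ARMA(0, 1)
b, a = linearized_fit(lams, h_target, M=0, N=1)
h_fit = np.polyval(b[::-1], lams) / (1 + np.polyval(np.r_[a[::-1], 0], lams))
```

Because the target here is itself rational of the chosen order, the linearized fit recovers it exactly; for general targets the linearization is only a surrogate for the true MSE, which is why the iterative refinements in the table exist.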
4. ARMA Graph Filters in Dynamic, Random, and Time-Vertex Settings
ARMA graph filtering extends naturally to dynamic contexts:
- Time-Varying Signals/Graphs: For time-varying signals $x_t$ and (possibly) dynamic graphs $S_t$, ARMA filters act as joint 2D filters in both graph-frequency and temporal domains, with transfer function of the form

$$H(\lambda, z) = \frac{\sum_{m=0}^{M} b_m \lambda^m z^{-m}}{1 + \sum_{n=1}^{N} a_n \lambda^n z^{-n}}, \qquad z = e^{j\omega},$$

where $\omega$ is a temporal frequency variable (Loukas et al., 2015, Isufi et al., 2016, Guneyi et al., 2023).
- Random Graphs/Signals: With random time-varying graphs and/or inputs, stochastic analysis shows that the expectation satisfies a deterministic ARMA recursion on the expected graph. Variance is bounded by the size of the graph fluctuations and the magnitude of AR/MA coefficients, with robustness increased for small coefficient norms and low graph variability (Isufi et al., 2017).
- Time-Vertex ARMA: The process is modeled by a joint ARMA recursion over time and graph shifts, permitting analytic derivation and convex learning of the joint power spectral density. Missing values are interpolated via MMSE estimation based on the learned ARMA model, with estimates that converge as the number of samples grows (Guneyi et al., 2023).
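The joint graph-temporal response can be verified numerically for the first-order recursion: driving it with a complex harmonic supported on a single graph eigenvector, the steady-state output equals the 2D transfer function evaluated at that (graph, temporal) frequency pair (the recursion form, graph, and coefficients below are assumptions for illustration):

```python
import numpy as np

# Recursion y_{t+1} = psi * S @ y_t + phi * x_t, driven by
# x_t = u * exp(j*w*t) with u an eigenvector of S (eigenvalue lam).
# Steady state: y_t = H(lam, e^{jw}) * u * exp(j*w*t), where
# H(lam, z) = phi * z^(-1) / (1 - psi * lam * z^(-1)).
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], float)
L = np.diag(A.sum(1)) - A
lam_all, U = np.linalg.eigh(L)
k = 1                                    # pick one graph frequency
u, lam = U[:, k], lam_all[k]

psi, phi, w = 0.2, 1.0, 0.7              # |psi * lam| < 1: stable
y = np.zeros(3, complex)
for t in range(400):
    y = psi * (L @ y) + phi * u * np.exp(1j * w * t)

H = phi * np.exp(-1j * w) / (1 - psi * lam * np.exp(-1j * w))
y_pred = H * u * np.exp(1j * w * 400)    # predicted state after 400 steps
```

The transient decays geometrically at rate $|\psi \lambda|$, so after a few hundred steps the recursion sits exactly on the joint frequency response.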
5. Practical Performance, Robustness, and Comparison to FIR Graph Filters
ARMA graph filters exhibit distinct advantages over FIR (finite impulse response) filters:
- Expressivity: Rational (ARMA) filters approximate sharp spectral transitions (e.g., step or window low-pass) with far fewer parameters and reduced Gibbs artifacts compared to polynomial FIRs, which require high order to match steep transitions (Loukas et al., 2015, Liu et al., 2017).
- Adaptivity and Tracking: Running ARMA recursions until convergence, then continuing to update as the input signal or the graph changes, allows continuous adaptation to signal and topology variations without repeated restarts, unlike FIRs (Loukas et al., 2015, Isufi et al., 2016, Isufi et al., 2017).
- Noise and Perturbation Robustness: Inclusion of autoregressive terms ensures filter stability, improved conditioning, and resistance to noise or spectral mismatches. Empirical studies show ARMA filters maintain accuracy under node mobility, time-varying graphs, as well as perturbations, where FIR error grows substantially (Loukas et al., 2015, Bianchi et al., 2019).
- Empirical Evidence: In both classical signal processing and modern GNN contexts, ARMA filters outperform polynomial designs in interpolation, denoising, and regression tasks, yielding lower error, faster convergence, and superior generalization (Bianchi et al., 2019, Liu et al., 2017, Isufi et al., 2016).
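The expressivity gap is easy to demonstrate: the Tikhonov-type response $1/(1+\lambda)$ is an exact ARMA(0,1) filter, while a polynomial (FIR) fit with the same parameter count leaves a substantial residual (the spectral interval $[0, 2]$ is an illustrative choice):

```python
import numpy as np

lams = np.linspace(0.0, 2.0, 100)
h_target = 1.0 / (1.0 + lams)            # Tikhonov-type rational response

h_arma = 1.0 / (1.0 + 1.0 * lams)        # ARMA(0,1) with b0 = 1, a1 = 1: exact
c = np.polyfit(lams, h_target, 1)        # degree-1 polynomial (FIR) fit
h_fir = np.polyval(c, lams)

err_arma = np.max(np.abs(h_arma - h_target))   # zero by construction
err_fir = np.max(np.abs(h_fir - h_target))     # clearly nonzero
```

Matching the rational target with an FIR filter to comparable accuracy would require a much higher polynomial order, which is the parameter-efficiency argument above in miniature.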
6. Applications and Scope in Graph Signal Processing and Deep Learning
ARMA graph filters are deployed across a range of theoretical and applied domains:
- Signal Processing: Denoising, interpolation, compression, and prediction tasks on graphs exploit ARMA’s superior approximation and compression properties, admitting optimal or nearly-optimal solutions in contexts such as Tikhonov denoising and Wiener filtering (Isufi et al., 2016, Liu et al., 2017).
- Graph Neural Networks: Convolutional GNN layers designed around ARMA recursions generalize polynomial-based GCNs and Chebyshev GNNs, providing flexible frequency-shaped filters that empirically improve node and graph classification, regression, and robustness to over-smoothing. Dynamic, adaptive ARMA-GNNs with attention for AR/MA coefficients (e.g., GRAMA) further enhance long-range dependency modeling and permutation equivariance (Bianchi et al., 2019, Abburi et al., 17 Jan 2026, Eliasof et al., 22 Jan 2025).
- Time-Vertex Learning: ARMA filters learned jointly over time and graph axes enable robust interpolation and forecasting of time-varying graph signals, with proven accuracy and sample complexity guarantees (Guneyi et al., 2023).
- Neuroimaging and Medicine: Hybrid frameworks such as ARMARecon fuse ARMA filtering with representation learning to classify neurodegenerative disease from dMRI data, leveraging ARMA’s ability to capture both local and global graph structure while regularizing for feature diversity and over-smoothing (Abburi et al., 17 Jan 2026).
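For instance, the Tikhonov denoising solution $(I + wL)^{-1} x$ mentioned above is itself a first-order ARMA filter and can be computed by the distributed recursion $y \leftarrow -w L y + x$, stable whenever $w\,\rho(L) < 1$ (the graph and regularization weight below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], float)
L = np.diag(A.sum(1)) - A
w = 0.1                                  # w * rho(L) ≈ 0.34 < 1: stable

x = rng.normal(size=4)                   # noisy observed signal
y = np.zeros(4)
for _ in range(200):
    y = -w * (L @ y) + x                 # first-order ARMA recursion

y_direct = np.linalg.solve(np.eye(4) + w * L, x)   # closed-form Tikhonov
```

Each iteration uses only one-hop neighbor exchanges, so the regularized denoiser runs distributively without ever forming or inverting the matrix.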
7. Limitations, Extensions, and Current Research Directions
Designing ARMA graph filters involves several open and practical considerations:
- Stability: The filter denominator must be checked for zeros across the relevant spectrum to avoid instability. Stability enforcement is integral to all modern design algorithms (Loukas et al., 2015, Pakiyarajah et al., 2021).
- Nonconvexity: The design landscape is typically nonconvex; while iterative relaxations and projection methods succeed in practice, no general global optimality guarantees exist (Liu et al., 2017, Pakiyarajah et al., 2021).
- Complexity: The distributed and recursive structure yields per-iteration and memory costs scaling with the order and local graph degree, but offers efficient realization in large and dynamic networks (Isufi et al., 2016).
- Model Selection: Order selection (number of AR/MA terms) impacts expressivity and complexity. Over-parameterization risks instability, while under-parameterization impairs accuracy (Liu et al., 2017).
- Generalizations: Recent research explores adaptive, data-dependent ARMA coefficients via attention, permutation-invariant state-space extensions, and joint space-time ARMA for non-stationary and higher-order dependency modeling (Eliasof et al., 22 Jan 2025, Guneyi et al., 2023).
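A basic numerical stability check along these lines: evaluate the ARMA denominator over the relevant spectral interval and verify that it stays bounded away from zero (the grid density and example coefficients are illustrative):

```python
import numpy as np

def denominator_margin(a, lam_min, lam_max, grid=1000):
    """Minimum of |1 + sum_n a_n λ^n| over a grid on [lam_min, lam_max]."""
    lams = np.linspace(lam_min, lam_max, grid)
    den = 1 + sum(an * lams ** (n + 1) for n, an in enumerate(a))
    return np.min(np.abs(den))

# Denominator 1 + 0.2λ has no zero on [0, 2]: comfortable margin.
m_stable = denominator_margin([0.2], 0.0, 2.0)

# Denominator 1 - λ vanishes at λ = 1: near-zero margin flags instability.
m_unstable = denominator_margin([-1.0], 0.0, 2.0)
```

A gridded check like this is only a screen; the design algorithms cited above enforce an explicit margin as a constraint during optimization rather than verifying it after the fact.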
Ongoing work aims to further improve learning procedures on unknown or streaming graphs, optimize for hardware efficiency, and integrate ARMA filtering seamlessly with deep learning architectures for scalable, robust graph representation learning.