Spiking State-Space Models
- Spiking SSMs are neural architectures that couple higher-dimensional latent state dynamics with a discrete spiking nonlinearity, merging insights from control theory and computational neuroscience.
- They employ efficient parallel computation schemes, such as FFT-based convolutions and surrogate gradients, to handle long sequences and reduce computational complexity.
- These models offer significant energy savings and scalability, making them ideal for neuromorphic implementations and edge applications in long-range temporal tasks.
Spiking State-Space Models (SSMs) generalize the canonical spiking neuron by embedding its dynamics within a trainable linear state-space system, followed by a discrete spiking nonlinearity. Unlike conventional SNNs—where each neuron maintains a scalar membrane potential—SSM-based spiking architectures leverage higher-dimensional latent states, structured recurrences, and parallel-update schemes, integrating core methods from control theory, computational neuroscience, and deep sequence modeling. Recent advances systematically reinterpret long-memory neural sequence models via the lens of SSMs, yielding architectures that combine sparse, event-driven computation with high-capacity, neuromorphic-friendly memory mechanisms suitable for long-range temporal tasks, sequence modeling, and energy-constrained applications.
1. Mathematical Formulations and Model Variants
Spiking SSMs are formulated as discrete-time or continuous-time state-space systems augmented with spiking nonlinearity. The general state update and output equations are:
- Continuous-time SSM:
$$\dot{x}(t) = A\,x(t) + B\,u(t), \qquad y(t) = C\,x(t)$$
where $A$, $B$, $C$ are trainable matrices, $x(t)$ is the hidden state, and $u(t)$ is the input (often a binary spike).
- Discretization (Zero-Order Hold or Bilinear):
$$x_k = \bar{A}\,x_{k-1} + \bar{B}\,u_k, \qquad y_k = C\,x_k$$
with $\bar{A}$, $\bar{B}$ given by analytic formulas depending on the integration step $\Delta$, e.g. $\bar{A} = e^{A\Delta}$ and $\bar{B} = A^{-1}(\bar{A} - I)B$ under zero-order hold (Zhong et al., 2024, Shen et al., 2024, Fabre et al., 4 Jun 2025).
- Core spiking rule (Leaky Integrate-and-Fire with reset/refractory); a standard form is:
$$v_t = \beta\,v_{t-1} + I_t - V_{\mathrm{reset}}\,s_{t-1}, \qquad s_t = \Theta(v_t - v_{\mathrm{th}})$$
where $v_t$ is the membrane potential, $\beta$ the leak factor, $s_{t-1}$ the refractory/reset state from the previous step, $V_{\mathrm{reset}}$ the reset magnitude, $v_{\mathrm{th}}$ the threshold, and $\Theta$ the Heaviside function (Zhong et al., 2024, Shen et al., 2024).
- Probabilistic extension: sampling spikes from a Bernoulli firing probability $p_t$, i.e. $s_t \sim \mathrm{Bernoulli}(p_t)$ with $p_t = \sigma(v_t - v_{\mathrm{th}})$ (Bal et al., 2024).
Second-order formulations, e.g., SHaRe-SSM, further generalize the neuron as an oscillatory (harmonic) system:
$$\ddot{x}(t) + 2\zeta\omega\,\dot{x}(t) + \omega^2 x(t) = u(t)$$
enabling resonant (non-decaying or energy-conserving) long memory (Agrawal et al., 16 Oct 2025, Fabre et al., 4 Jun 2025).
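The discretized formulation above can be sketched in a few lines of numpy. The following toy example (a minimal sketch; the diagonal-$A$ parametrization, full-state hard reset, and all constants are illustrative assumptions, not any one paper's exact model) runs a ZOH-discretized diagonal SSM whose readout drives a Heaviside spike:

```python
import numpy as np

rng = np.random.default_rng(0)

N, T = 8, 100                       # latent state size, sequence length
dt = 0.1                            # integration step
a = -np.exp(rng.uniform(-2.0, 0.0, N))   # stable (negative) diagonal poles
b = rng.standard_normal(N)
c = rng.standard_normal(N)

# Zero-order-hold discretization (elementwise, since A is diagonal):
a_bar = np.exp(a * dt)
b_bar = (a_bar - 1.0) / a * b

theta = 0.5                         # firing threshold
u = (rng.random(T) < 0.2).astype(float)  # sparse binary input spike train

x = np.zeros(N)
spikes = np.zeros(T)
for t in range(T):
    x = a_bar * x + b_bar * u[t]    # linear state-space update
    y = c @ x                       # readout drives the "membrane"
    s = float(y >= theta)           # Heaviside spiking nonlinearity
    x = x - s * x                   # hard reset of the latent state (one convention)
    spikes[t] = s
```

The diagonal parametrization is what makes the discretization elementwise; dense $A$ would require a matrix exponential instead.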
2. Computational Schemes and Parallelism
Event-driven neuronal dynamics traditionally lead to high computational costs for long sequences due to sequential spike integration. Spiking SSM frameworks incorporate several algorithmic strategies for tractable parallel computation:
- FFT-based convolutional recursion: the output can be rendered as a time-domain convolution over inputs, $y = \bar{K} * u$ with kernel $\bar{K} = (C\bar{B},\, C\bar{A}\bar{B},\, \ldots,\, C\bar{A}^{L-1}\bar{B})$. FFT reduces complexity from $O(L^2)$ to $O(L \log L)$ per sequence of length $L$ (Zhong et al., 2024, Shen et al., 2024).
- Surrogate Dynamic Network (SDN): predicts entire spike trains in parallel given input currents, scaling inputs to absorb trainable thresholds and thereby recovering training parallelism (Shen et al., 2024).
- Parallel Max–min Boundary Compression (PMBC): iteratively computes upper and lower bounds on membrane potentials (accounting for resets) via FFT over long spiking sequences, resolving most spike decisions within a small number of iterations (Zhong et al., 2024).
- Parallel scan implementation: SHaRe-SSM exploits associativity of state-space updates for cumulative product computation in $O(\log L)$ parallel depth, enabling long contexts without vanishing/exploding gradients (Agrawal et al., 16 Oct 2025).
- Selective Scanning: Processes only timesteps coinciding with spike events, reducing computation for sparse input streams, most notably in SpikySpace (Tang et al., 2 Jan 2026).
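The FFT-based scheme above can be verified directly: unroll the SSM into its convolution kernel, compute the output once sequentially and once via FFT, and check that the two agree. This is a minimal sketch (diagonal $A$ assumed, as in S4D-style models; all parameter values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
N, L = 4, 64
dt = 0.05
a = -np.exp(rng.uniform(-2.0, 0.0, N))
b = rng.standard_normal(N)
c = rng.standard_normal(N)
a_bar = np.exp(a * dt)                    # ZOH discretization (diagonal A)
b_bar = (a_bar - 1.0) / a * b

# Unrolled convolution kernel: K[k] = C * A_bar^k * B_bar
K = (c[:, None] * b_bar[:, None] * a_bar[:, None] ** np.arange(L)).sum(0)

u = rng.standard_normal(L)

# (1) Sequential recurrence: O(L) dependent steps
x = np.zeros(N)
y_seq = np.zeros(L)
for t in range(L):
    x = a_bar * x + b_bar * u[t]
    y_seq[t] = c @ x

# (2) FFT-based causal convolution: O(L log L), fully parallelizable
n = 2 * L                                 # zero-pad to avoid circular wrap-around
y_fft = np.fft.irfft(np.fft.rfft(K, n) * np.fft.rfft(u, n), n)[:L]

assert np.allclose(y_seq, y_fft)          # both paths agree
```

Note that this parallel path covers only the linear SSM; the nonlinear spike/reset interaction is what schemes like SDN and PMBC are designed to handle.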
3. Hierarchical and Block Architectures
Spiking SSMs employ modular blocks integrating state-space memory with spiking nonlinearity:
- Layered composition: Each block stacks SSM dynamics followed by LIF or resonate-and-fire activation, usually with trainable leak, reset, and refractory parameters (Zhong et al., 2024, Agrawal et al., 16 Oct 2025, Shen et al., 2024).
- Feature mixing and residual connections: Many architectures introduce Conv1d+GLU or dense mixer blocks ("SpikeMixer") for intra-layer spike communications, followed by normalization and ClampFuse (residual fusion) layers to preserve long-range dependencies (Bal et al., 2024, Shen et al., 2024).
- Multiple-input/multiple-output neurons: MIMO/SIMO neurons generalize classical SISO spiking units to project rich state vectors over multiple spike channels, improving communication and expressivity for fixed neuron budget (Karilanova et al., 3 Apr 2025).
- Kernel-based spiking regression: Networks such as SHaRe-SSM convolve output spikes with trainable temporal kernels to realize smooth regression on extremely long sequences (Agrawal et al., 16 Oct 2025).
Trainable parameters often include the SSM matrices ($A$, $B$, $C$), thresholds, leak factors, learned timescales (via log-parametrization), and mixer weights.
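A loose sketch of such a layered block follows. The dense mixer and clipped residual below are hypothetical stand-ins for the "SpikeMixer" and "ClampFuse" components named above (their exact definitions in the cited papers differ), and the sequential SSM loop is kept for clarity rather than speed:

```python
import numpy as np

rng = np.random.default_rng(2)
L, D, N = 32, 6, 4        # sequence length, model width, latent size per channel

def ssm_layer(u, a_bar, b_bar, c):
    """Per-channel diagonal SSM, run sequentially here for clarity."""
    T = u.shape[0]
    x = np.zeros_like(b_bar)
    y = np.zeros_like(u)
    for t in range(T):
        x = a_bar * x + b_bar * u[t][:, None]
        y[t] = (x * c).sum(axis=1)
    return y

a_bar = np.exp(-np.exp(rng.uniform(-2.0, 0.0, (D, N))) * 0.1)  # decays in (0, 1)
b_bar = 0.1 * rng.standard_normal((D, N))
c = rng.standard_normal((D, N))
W_mix = rng.standard_normal((D, D)) / np.sqrt(D)

def spiking_ssm_block(u, theta=0.1):
    y = ssm_layer(u, a_bar, b_bar, c)     # state-space memory
    s = (y >= theta).astype(float)        # Heaviside spiking nonlinearity
    mixed = s @ W_mix                     # stand-in for a "SpikeMixer" dense mixer
    return np.clip(u + mixed, -1.0, 1.0)  # stand-in for "ClampFuse" residual fusion

u = (rng.random((L, D)) < 0.3).astype(float)
out = spiking_ssm_block(u)
print(out.shape)  # (32, 6)
```

Stacking several such blocks, with normalization between them, yields the layered compositions described above.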
4. Learning Protocols and Surrogate Gradients
Spiking SSM training removes the bottleneck of non-differentiable spike emissions via smooth surrogates and probabilistic estimation:
- Surrogate gradient techniques: replace the non-differentiable Heaviside derivative by smooth boxcar, triangular, or custom kernel approximations, with derivative structures such as $\frac{\partial s}{\partial v} \approx \frac{1}{2a}\,\mathbf{1}\{|v - v_{\mathrm{th}}| < a\}$ (boxcar window of half-width $a$).
- Probabilistic spike sampling: Bernoulli sampling via a learned probability $p_t$, with a smooth surrogate gradient for stable stochastic backpropagation (Bal et al., 2024).
- Log-parametrization of timescales: ensures positivity and numerical stability for leak/decay rates, closely paralleling S4D/LinSSM best practices (Fabre et al., 4 Jun 2025).
Typical losses include cross-entropy for classification, MSE for regression, and norm-based regularizers or frequency masking for anti-aliasing in event-based models (Zubić et al., 2024).
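The surrogate-gradient idea can be shown in a few lines: the Heaviside function has zero gradient almost everywhere, so the backward pass substitutes a boxcar window. The toy objective and all constants below are illustrative assumptions:

```python
import numpy as np

def heaviside(v, theta=1.0):
    return (v >= theta).astype(float)

def boxcar_surrogate(v, theta=1.0, width=0.5):
    """Surrogate dS/dv: constant inside a window around the threshold, zero outside."""
    return (np.abs(v - theta) < width).astype(float) / (2.0 * width)

# Toy objective: push the mean firing rate toward a target, then backprop
# through the (otherwise zero-gradient) spike function via the surrogate.
v = np.linspace(0.0, 2.0, 9)        # membrane potentials
target_rate = 0.5
s = heaviside(v)
loss = 0.5 * (s.mean() - target_rate) ** 2

dL_ds = (s.mean() - target_rate) / v.size     # dL/ds_i through the mean
dL_dv = dL_ds * boxcar_surrogate(v)           # chain rule with surrogate dS/dv
```

Only membrane potentials within the boxcar window around the threshold receive gradient, which is exactly what makes threshold-adjacent neurons trainable despite the hard nonlinearity in the forward pass.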
5. Performance, Sparsity, and Efficiency
Spiking SSMs substantially advance both efficiency and accuracy for long-sequence modeling and resource-constrained inference:
- Sparsity: Spiking rates in recent models are typically low (only a small fraction of timesteps carry a spike), yielding high activation sparsity and a correspondingly reduced synaptic operation count (Zhong et al., 2024, Shen et al., 2024, Bal et al., 2024).
- Energy consumption: Switching from dense multiply-accumulate (MAC) to sparse accumulate (ACC) events yields large energy savings; SpikySpace reports substantial reductions versus iTransformer and iSpikformer on Electricity (Tang et al., 2 Jan 2026), and SHaRe-SSM reports large reductions over second-order ANN-SSMs (Agrawal et al., 16 Oct 2025).
- Accuracy benchmarks: SpikingSSMs and SPikE-SSMs match or outperform Transformers and ANN-SSMs on Long Range Arena, WikiText-103, sequential MNIST, and various event-based vision and audio tasks, often with $1/3$ of the parameter count and superior sparsity (Zhong et al., 2024, Shen et al., 2024, Fabre et al., 4 Jun 2025, Tang et al., 2 Jan 2026, Zubić et al., 2024).
- Scalability: Parallel and kernel-based computation allow deployment on sequences up to $50$k steps (SHaRe-SSM) with minimal degradation; learnable delays and MIMO architectures disproportionately benefit small/neuron-poor networks (Karilanova et al., 1 Dec 2025, Agrawal et al., 16 Oct 2025, Karilanova et al., 3 Apr 2025).
- Event-based vision: SSMs with learned timescales generalize across inference rates and outperform Transformers/RNNs by limiting aliasing while offering inference speedups (Zubić et al., 2024).
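The MAC-versus-ACC argument above can be made concrete with a back-of-envelope calculation. All numbers here are illustrative assumptions (the per-operation energies are loosely based on widely cited 45 nm process estimates, not values taken from the surveyed papers):

```python
# Energy figures are illustrative assumptions, not values from the surveyed papers.
E_MAC = 4.6e-12    # J per 32-bit multiply-accumulate (dense ANN path)
E_ACC = 0.9e-12    # J per 32-bit accumulate (event-driven spiking path)

synapses = 1_000_000    # synaptic connections evaluated per timestep
T = 1000                # timesteps in the sequence
spike_rate = 0.05       # fraction of timesteps a presynaptic neuron fires

e_dense = synapses * T * E_MAC                  # every synapse, every step
e_spiking = synapses * T * spike_rate * E_ACC   # only on spike events

print(f"dense:   {e_dense:.2e} J")
print(f"spiking: {e_spiking:.2e} J")
print(f"ratio:   {e_dense / e_spiking:.0f}x")   # ~102x under these assumptions
```

The savings multiply: cheaper operations (ACC vs MAC) times fewer operations (event-driven sparsity), which is why both low spike rates and multiplication-free updates matter for neuromorphic deployment.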
6. Extensions: Delays, Higher-Order Dynamics, and Probabilistic Models
Recent work generalizes spiking SSMs in several directions:
- Delay-augmented SSMs: Introduce additional state buffers per neuron to ingest past inputs, enabling explicit finite-order memory and compensating for small network sizes (Karilanova et al., 1 Dec 2025). The resulting delay SSM is neutrally stable and interpretable; delay weights may be learned.
- Resonate-and-fire and complex SSMs: Second-order or complex state-space formulations yield resonant integrate-and-fire neurons suitable for non-decaying long-memory and kernel regression on extended time series (Agrawal et al., 16 Oct 2025, Fabre et al., 4 Jun 2025).
- Moment-closure SSMs: Point-process GLMs may be rendered as infinite-dimensional SSMs and reduced via basis projection and moment closure, resulting in finite-dimensional SSM filters with explicit statistics of spike-train fluctuations and covariances (Rule et al., 2018).
- Probabilistic SSMs: Stochastic spike-emission via Bernoulli, surrogate gradients, and parallel convolutional implementations; SpikeSampler and Mixer blocks enable scalable and interpretable probabilistic spiking networks (Bal et al., 2024).
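The delay-augmented idea can be illustrated as a shift-register state-space: the state buffer holds the last $d$ inputs, the transition matrix is a pure shift, and learned weights tap the delayed copies. This is a minimal sketch; the tap weights and input sequence are arbitrary illustrative values:

```python
import numpy as np

# A length-d delay buffer written as a linear state-space system:
# x_t = (u_{t-1}, ..., u_{t-d}); A is a shift matrix, so the system is a
# pure finite-order memory. The delay weights w are trainable in the cited
# work; the values here are arbitrary.
d = 4
A = np.eye(d, k=-1)                  # sub-diagonal shift: pushes entries down
B = np.zeros(d)
B[0] = 1.0                           # new input enters at the top of the buffer
w = np.array([0.5, 0.0, 0.0, 1.0])  # taps at delays 1 and 4 (hypothetical)

u = np.array([1., 0., 0., 0., 0., 0., 1., 0.])
x = np.zeros(d)
y = np.zeros(len(u))
for t in range(len(u)):
    y[t] = w @ x                     # weighted sum of past inputs u_{t-1..t-d}
    x = A @ x + B * u[t]             # shift the buffer and append u_t

print(y)  # each input spike reappears at delays 1 and 4, scaled by w
```

Because the shift matrix neither amplifies nor decays its contents while they remain in the buffer, the memory is explicit and exact up to horizon $d$, rather than an exponentially fading trace.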
7. Biological Interpretability and Implications for Neuromorphic Hardware
Spiking SSMs retain direct biological analogies, particularly in their dendrite-soma separation, refractory/soft-reset schemes, and event-driven integration:
- LIF and resonate neurons closely emulate membrane integration, refractoriness, and subthreshold oscillatory phenomena (Zhong et al., 2024, Agrawal et al., 16 Oct 2025, Fabre et al., 4 Jun 2025).
- Delay and kernel mechanisms parallel axonal delays and synaptic plasticity found in biological circuits (Karilanova et al., 1 Dec 2025, Rule et al., 2018).
- Substantial energy savings, high sparsity, bit-shift-only updates, and absence of dense multiplications position spiking SSMs as foundational architectures for next-generation neuromorphic chips and scalable low-latency edge inference (Tang et al., 2 Jan 2026, Zhong et al., 2024, Shen et al., 2024).
A plausible implication is that the continued fusion of SSM principles with stochastic, multi-channel, and delay-rich spiking architectures will progressively close the accuracy gap with dense ANNs for sequence modeling, while preserving interpretability and efficiency suitable for real-world edge deployments.
References:
- Zhong et al., 2024
- Shen et al., 2024
- Bal et al., 2024
- Agrawal et al., 16 Oct 2025
- Fabre et al., 4 Jun 2025
- Karilanova et al., 1 Dec 2025
- Karilanova et al., 3 Apr 2025
- Tang et al., 2 Jan 2026
- Rule et al., 2018
- Zubić et al., 2024