
Equilibrium Propagation in Neural Networks

Updated 2 February 2026
  • Equilibrium Propagation is a learning framework that trains energy-based models by relaxing network states under free and nudged conditions using local updates.
  • It estimates gradients by contrasting system states with slight output perturbations, achieving equivalence to backpropagation under symmetric conditions.
  • EP extends to deep, convolutional, spiking, and neuromorphic architectures, offering robust, energy-efficient learning even in noisy hardware environments.

Equilibrium Propagation (EP) is a framework for training energy-based models (EBMs) that leverages local, biologically inspired learning rules via the relaxation of network dynamics to steady states under both free and weakly clamped (nudged) conditions. EP is distinguished by its locality in space and compatibility with physical, neuro-inspired, and neuromorphic substrates. Originally developed for static-input, symmetric convergent recurrent neural networks (RNNs), EP's scope has expanded to encompass deep feedforward architectures, convolutional networks, spiking neural networks (SNNs), quantum systems, and dynamical physical media. The core principle is to use the internal physics of the system for both inference and credit assignment by contrasting system states under small output perturbations, thereby computing a gradient-like update using only local information.

1. Mathematical Foundations and Learning Dynamics

EP operates on systems described by an energy function $E(s, \theta, x)$, with network state $s$, parameters $\theta$, and static input $x$. Training comprises two sequential phases:

  1. Free phase ($\beta = 0$): The system relaxes to a free fixed point $s^0 = \arg\min_s E(s, \theta, x)$ under input $x$.
  2. Nudged phase ($\beta > 0$): A cost or loss term $C(s, y)$ encoding supervision is softly imposed with a small "nudging" strength $\beta$, yielding an augmented energy $E_\beta(s, \theta, x, y) = E(s, \theta, x) + \beta C(s, y)$. The network relaxes to a new steady state $s^\beta$.

The update rule is a finite-difference estimate of the parameter gradient:

$$\Delta\theta = -\frac{\eta}{\beta}\left[ \partial_\theta E(s^\beta, \theta, x) - \partial_\theta E(s^0, \theta, x) \right]$$

with $\eta$ the learning rate. Under mild regularity conditions, the update is an unbiased estimator of the gradient of the loss with respect to $\theta$ in the limit $\beta \to 0$ (Laborieux et al., 2020, Ernoult et al., 2019).

Discrete-time and continual (C-EP) formulations have been derived, enabling both efficient simulation and synapse dynamics local in space and time (Ernoult et al., 2020, Ernoult et al., 2020).

2. Equivalence to Backpropagation Through Time and Gradient Estimation Bias

EP's gradient updates are theoretically equivalent to those computed via Backpropagation Through Time (BPTT) under symmetric, energy-derived transition functions and sufficiently long relaxation, both in the continuous and discrete-time regimes (Ernoult et al., 2019, Ernoult et al., 2020). For deep and convolutional architectures, under the “primitive function” formulation $\Phi(x, s, \theta)$, per-step EP updates coincide with BPTT gradients.

However, a bias emerges at finite nudging strength $\beta$: one-sided finite differences introduce an $O(\beta)$ error. Symmetric (central-difference) estimators, implemented via three-phase EP ($+\beta$, $0$, $-\beta$), reduce this bias to $O(\beta^2)$, enabling EP to scale reliably to deep convnets, yielding test errors on CIFAR-10 within ~1% of BPTT (Laborieux et al., 2021, Laborieux et al., 2020).

$$\widehat{\nabla}^{\mathrm{EP}}_{\mathrm{sym}}(\beta) = \frac{1}{2\beta}\left[ \partial_\theta E(s^{+\beta}, \theta, x) - \partial_\theta E(s^{-\beta}, \theta, x) \right]$$
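The bias orders can be seen directly with scalar finite differences, using a smooth toy function $f(\beta)$ as a stand-in for the nudged quantity $\partial_\theta E(s^\beta, \theta, x)$ (the function and step sizes below are arbitrary illustrative choices):

```python
import math

def f(beta):
    return math.exp(beta)   # smooth toy function with f'(0) = 1

true_grad = 1.0
for beta in (0.5, 0.05, 0.005):
    one_sided = (f(beta) - f(0.0)) / beta            # two-phase EP estimate
    symmetric = (f(beta) - f(-beta)) / (2.0 * beta)  # three-phase EP estimate
    print(f"beta={beta}: one-sided err {abs(one_sided - true_grad):.2e}, "
          f"symmetric err {abs(symmetric - true_grad):.2e}")
```

The one-sided error shrinks like $\beta/2$ while the symmetric error shrinks like $\beta^2/6$, mirroring the $O(\beta)$ versus $O(\beta^2)$ claim above.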

3. Extensions: Hardware Considerations, Asymmetry, and Noise Robustness

EP’s spatial locality makes it well suited to neuromorphic and analog hardware: every synaptic update depends only on the pre- and post-synaptic activities measured before and after the nudge, with no need for global error signals or dedicated backward paths (Peters et al., 28 Mar 2025, Gower, 14 Oct 2025). EP also tolerates quantization and hardware noise; in fact, moderate measurement uncertainty or induced stochasticity can regularize learning, enlarge attractor basins, and improve convergence (Peters et al., 28 Mar 2025, Lin et al., 14 Nov 2025, Gower, 14 Oct 2025).

Weight asymmetry, i.e., nonidentical forward and backward connectivity, introduces a second independent source of bias due to misalignment of Jacobians. This asymmetry can be mitigated without explicit weight tying, for example by adding a “Jacobian homeostasis” regularizer that penalizes the skew-symmetric part of the network's Jacobian at equilibrium, restoring nearly full data efficiency and gradient alignment even on large-scale tasks (e.g., ImageNet 32x32) (Laborieux et al., 2023).
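The idea of penalizing the skew-symmetric part of the Jacobian can be shown in miniature; this is a hedged sketch (the actual regularizer in Laborieux et al., 2023 is estimated from network dynamics, not from a dense matrix), using the fact that the penalty vanishes exactly when the Jacobian is symmetric, i.e. when the dynamics look locally energy-derived:

```python
import numpy as np

rng = np.random.default_rng(1)

def homeostasis_penalty(J):
    """Squared Frobenius norm of the skew-symmetric part of J."""
    skew = 0.5 * (J - J.T)
    return np.sum(skew ** 2)

J_asym = rng.normal(size=(5, 5))     # hypothetical asymmetric equilibrium Jacobian
J_sym = 0.5 * (J_asym + J_asym.T)    # its symmetrized counterpart

print(homeostasis_penalty(J_asym) > 0)   # True: asymmetry is penalized
print(homeostasis_penalty(J_sym))        # 0.0: symmetric dynamics incur no penalty
```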

Advances in continual EP (C-EP) allow immediate, temporally local weight updates during the nudged phase (Ernoult et al., 2020, Ernoult et al., 2020), further aligning with biological plausibility and increasing practicality for event-driven hardware.

4. Advanced Algorithmic Developments: Scalability, Finite-Nudge Theory, and Variational Formulations

Standard EP suffers from vanishing credit signals in deep architectures because state differences decay exponentially with depth. Recent extensions inject intermediate error signals (layerwise local error projections or knowledge-distillation targets) directly into the nudged-phase dynamics, preserving information flow. This enables scalable EP on VGG-12 architectures with state-of-the-art performance on CIFAR-100, while reducing GPU memory demands by over an order of magnitude compared to BPTT (Lin et al., 21 Aug 2025).
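Schematically, injecting intermediate error signals amounts to augmenting the nudged-phase energy with per-layer cost terms; the notation below is a hedged restatement (the per-layer targets $t_l$ and strengths $\beta_l$ are stand-in symbols, not necessarily the paper's own):

```latex
E_{\beta}(s, \theta, x, y) \;=\; E(s, \theta, x)
  \;+\; \beta\, C(s_{\mathrm{out}}, y)
  \;+\; \sum_{l} \beta_l\, C_l(s_l, t_l)
```

where each $C_l$ compares the layer state $s_l$ to a local error projection or distillation target $t_l$, so that credit signals no longer have to propagate from the output alone.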

A finite-nudge foundation for EP has also been rigorously established: by modeling states as Gibbs–Boltzmann distributions (rather than deterministic points), the gradient of a stochastic contrastive objective (Helmholtz free energy difference) is exactly the expected energy-gradient difference under the free versus nudged distributions, without requiring infinitesimal nudging, convexity, or symmetry. This directly validates the classic Contrastive Hebbian Learning rule for arbitrary finite nudges, and allows EP to function with strong error signals, delivering high signal-to-noise parameter updates even when traditional infinitesimal-$\beta$ EP stalls (Litman, 27 Nov 2025).

$$\nabla_\theta J(\theta) = \mathbb{E}_{s \sim p_1}[\nabla_\theta E(\theta, s)] - \mathbb{E}_{s \sim p_0}[\nabla_\theta E(\theta, s)]$$

where $p_0$ and $p_1$ denote the free and nudged Gibbs–Boltzmann distributions, respectively.
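This expectation identity can be checked exactly on a system small enough to enumerate; the spin energy, cost, and finite $\beta$ below are made-up toy choices, with $J(\theta)$ taken as the free-energy difference between nudged and free ensembles:

```python
import numpy as np
from itertools import product

theta, beta = 0.7, 0.5   # finite nudge; no infinitesimal limit required
states = [np.array(s) for s in product([-1.0, 1.0], repeat=3)]

def E(th, s):            # toy spin energy; dE/dtheta = -s[0]*s[1]
    return -th * s[0] * s[1] - s[1] * s[2]

def C(s):                # toy cost nudging the last unit toward +1
    return (s[2] - 1.0) ** 2

def gibbs(energies):
    w = np.exp(-np.array(energies))
    return w / w.sum()

p0 = gibbs([E(theta, s) for s in states])                 # free distribution
p1 = gibbs([E(theta, s) + beta * C(s) for s in states])   # nudged distribution
dE = np.array([-s[0] * s[1] for s in states])             # dE/dtheta per state

grad = p1 @ dE - p0 @ dE   # expected energy-gradient difference
print(grad)
```

Differentiating the free-energy difference $F_\beta(\theta) - F_0(\theta)$ numerically recovers the same value, confirming the identity holds at this finite $\beta$.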

Path-integral generalizations frame the learning rule as an integral over loss–energy covariances along the $\beta$-path.

5. Extensions to Novel Substrates: Spiking, Quantum, and Dynamical Physical Media

EP has been extended to spiking neural networks (SNNs) (Lin et al., 14 Nov 2025, Lin et al., 2024), quantum systems (Wanjura et al., 2024, Scellier, 2024), oscillator Ising machines (Gower, 14 Oct 2025), nonlinear wave media (Sajnok et al., 17 Oct 2025), and time-varying inputs (Pourcel et al., 6 Jun 2025). In spiking models, stochastic equilibrium propagation (StochEP) employs stochastic neuron models, improving training stability, biological realism, and scalability to deep convolutional SNNs. Mean-field theory guarantees that StochEP's expected optimization trajectory matches deterministic EP (Lin et al., 14 Nov 2025).

Quantum EP (QEP) employs Onsager reciprocity, training quantum Hamiltonians by contrasting energy derivatives in ground/thermal states under free and output-perturbed Hamiltonians. This approach enables gradient-based learning in quantum simulators via two equilibrium (free/nudged) phases and is applicable to phase classification, quantum sensing, and variational optimization (Wanjura et al., 2024, Scellier, 2024).

Hamiltonian Echo Learning (HEL), derived as a special case of Lagrangian-based Generalized EP (GLEP), extends EP to time-varying inputs and fully dynamical systems, operating as a "forward-only" local-learning protocol compatible with reversible physical substrates (Pourcel et al., 6 Jun 2025).

6. Practical Performance, Limitations, and Biomimetic Potential

EP- and C-EP-trained models have achieved accuracy within 1%–3% of BPTT/BP on MNIST, FashionMNIST, CIFAR-10, and even ImageNet 32x32 benchmarks in both symmetric and (with regularization) asymmetric architectures (Laborieux et al., 2020, Laborieux et al., 2021, Laborieux et al., 2023, Lin et al., 21 Aug 2025, Laborieux et al., 2022). EP uniquely offers a single physical dynamical substrate for inference, gradient propagation, and learning, contrasting with the separate forward and backward passes in BP. Experimental implementations on oscillator Ising machines demonstrate both learning efficacy and energy efficiency at GHz timescales, with high robustness to quantization and hardware-level noise (Gower, 14 Oct 2025).

However, limitations remain: scaling unaugmented EP to very deep or highly nonconvex architectures is challenging due to vanishing signals, finite-$\beta$ bias, and the need for careful hyperparameter tuning (nudging strength, learning rate, regularization). Per-sample runtime is typically higher than BP due to the requirement for multiple relaxation phases, and real-time or online deployment in resource-constrained neuromorphic hardware is limited by convergence speed and dynamic state retention. Fully end-to-end trainability with dynamic (non-static) input remains an active research direction (Pourcel et al., 6 Jun 2025).

7. Outlook: Unification, Impact, and Future Directions

Equilibrium Propagation has unified local, physics-driven learning rules across a spectrum of architectures, from analog Hopfield networks to quantum many-body systems and nonlinear physical media. Key recent advances—finite-nudge theory, scalable intermediate-loss mechanisms, Jacobian homeostasis, and continuous-time/oscillatory learning—enable EP to match the performance and flexibility of gradient backpropagation on large-scale tasks, while retaining spatial and temporal locality. EP is uniquely positioned as a candidate algorithm for future neuromorphic and physical AI systems, simultaneously addressing the credit assignment problem, memory efficiency, and the hardware–algorithm gap in learning.

Ongoing research addresses the formulation of EP-compatible “forward-only” learning for arbitrary dynamical, time-varying, or quantum substrates, robust real-time deployment in constrained hardware, hybrid integration with attention and memory modules, and the bridging of theoretical physics, biology, and machine learning. The continued development of EP-based methods promises further expansion of trainable physical and quantum information processing technologies, with strict locality, energy efficiency, and hardware/fabric compatibility (Litman, 27 Nov 2025, Pourcel et al., 6 Jun 2025, Wanjura et al., 2024, Sajnok et al., 17 Oct 2025, Gower, 14 Oct 2025, Lin et al., 21 Aug 2025).
