Learned Physical Nonlinearity
- Learned Physical Nonlinearity is the adaptive tuning of intrinsic nonlinear responses in physical systems, enabling optimized device efficiency and computational performance.
- It leverages native nonlinearities in photonic, electronic, and mechanical substrates through trainable models and physics-informed learning rules.
- Applications include enhanced optical communication, energy-efficient analog computing, and accurate physical system modeling with reduced device complexity.
Learned physical nonlinearity refers to the process by which nonlinear behaviors rooted in physical systems—arising from material properties, device transfer characteristics, dynamical coupling, or engineered interactions—are not merely imposed or simulated but adaptively tuned or exploited within a learning framework. Unlike traditional neural networks with fixed mathematical activation functions, learned physical nonlinearity leverages the native nonlinear responses of hardware or physical models and directly learns to optimize and control these responses for improved computational performance, device efficiency, or physical modeling accuracy. The concept spans photonic, electronic, mechanical, and spatiotemporal dynamical systems. Recent literature demonstrates both algorithmic and hardware-based embodiments of learned physical nonlinearity that outperform conventional architectures in efficiency and expressivity.
1. Principles and Formalization of Learned Physical Nonlinearity
Learned physical nonlinearity distinguishes itself from fixed nonlinearities by making the nonlinear response—often intrinsic to the underlying hardware or physics—an object of training or adaptation. In Kolmogorov–Arnold Network (KAN) architectures, the nonlinear “synaptic” transfer function is trainable, realized as a weighted sum of device-level transfer curves (e.g., by tuning voltages in a silicon-on-insulator device), and directly optimized by gradient methods in combination with a digital twin (Taglietti et al., 20 Jan 2026). The Kolmogorov–Arnold theorem guarantees universal approximation of multivariate continuous functions by sums of univariate nonlinear transformations. Mathematically,

f(x_1, \ldots, x_n) = \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right),

where the inner univariate functions \phi_{q,p} (and outer functions \Phi_q) are precisely the objects that can be realized and trained as physical transfer curves.
Hardware implementations initialize device parameters (e.g., control voltages, gains), optimize via in-silico differentiation, and deploy the trained configuration for regression or classification tasks.
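The train-then-deploy pipeline above can be sketched in a few lines. The following toy is our illustration, not the hardware of Taglietti et al.: the sigmoidal basis curves, slopes, and learning rate are assumptions standing in for device transfer curves and control voltages, and the gradient loop plays the role of in-silico differentiation against a digital twin.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 200)
target = np.tanh(3 * x)  # function the trained "synapse" should reproduce

# Hypothetical basis of fixed device transfer curves: shifted sigmoids.
shifts = np.linspace(-1, 1, 8)
basis = 1 / (1 + np.exp(-5 * (x[:, None] - shifts[None, :])))  # shape (200, 8)

# Trainable mixing weights, the analogue of control voltages.
w = rng.normal(scale=0.1, size=8)
lr = 0.3
for _ in range(5000):
    pred = basis @ w
    w -= lr * basis.T @ (pred - target) / len(x)  # d(MSE)/dw

mse = float(np.mean((basis @ w - target) ** 2))
print(f"final MSE: {mse:.4f}")
```

After training, the learned weight vector would be frozen and deployed as the device configuration; here it simply parameterizes the fitted nonlinearity.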
In optical and analog contexts, learned nonlinearity may arise through indirect means, such as voltage-controlled structural scattering (Xia et al., 24 Jun 2025) or element-wise polynomial products in physics-hard-encoded neural networks (Rao et al., 2021).
2. Physical Platforms and Engineering Mechanisms
Photonic and Optical Substrates
Recent advances show multivariate nonlinearities and high-order scattering interactions can be learned and controlled in analog photonic substrates. For instance, physical neural networks employing voltage-tuned liquid-crystal-polymer composites (LCPC) exploit “structural nonlinearity” generated by multi-bounce scattering, each configuration yielding a distinct, reconfigurable nonlinear transfer map (Xia et al., 24 Jun 2025). This nonlinearity is amplifiable via ensembles, and operates at ultra-low optical powers and voltages.
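A toy reading of structural nonlinearity: if the input v is encoded in the scattering medium itself, each bounce is linear in the optical field, yet the n-bounce field is a degree-n polynomial in v, and the detected intensity is degree 2n. The matrices and scales below are arbitrary assumptions chosen only to exhibit this mechanism.

```python
import numpy as np

def multi_bounce_output(v, bounces=3, seed=7):
    """Intensity after `bounces` linear scattering events whose matrices
    depend on the encoded input v (same frozen disorder for every v)."""
    rng = np.random.default_rng(seed)
    field = np.ones(4) / 2.0
    for _ in range(bounces):
        # One scattering event: linear in the field, affine in v.
        M = 0.3 * rng.normal(size=(4, 4)) + 0.3 * v * rng.normal(size=(4, 4))
        field = M @ field
    return float(np.sum(field ** 2))  # detected intensity

# Sweeping v reveals a strongly nonlinear input-output map.
outs = [multi_bounce_output(v) for v in np.linspace(-1.0, 1.0, 5)]
print([round(o, 4) for o in outs])
```

Each additional bounce raises the polynomial order, which is the sense in which ensembles of scattering configurations can amplify the available nonlinearity.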
In silicon photonics, Synaptic Nonlinear Elements (SYNEs) physically realize arbitrary nonlinear transfer characteristics (monotonic, saturating, negative differential resistance), tunable by external voltages. This direct control and learning of physical nonlinearity underpins the KAN approach and yields significantly more compact and expressive learning systems than weight-only networks (Taglietti et al., 20 Jan 2026).
Analog Electronic Networks
Analog networks using MOSFET-based nonlinear resistors with gate-controlled conductivity demonstrate that the transfer functions of the circuit elements themselves need not be idealized: they can be modulated, measured, and trained in a local learning process (Dillavou et al., 28 May 2025). Imperfections and deterministic bias in physical learning networks give rise to phenomena such as representational drift and limit cycles during task alternation.
Mechanical and Spatiotemporal Systems
Materials such as elastic networks with learned, nonlinear spring energies (tunable power-law stiffness exponents) enable sequential learning and memory preservation of multiple stable states in mechanical systems (Stern et al., 2019). In spatiotemporal dynamical systems, hard-encoded physics-informed architectures (PeRCNN) use elementwise products of learned convolutional features to realize interpretable, polynomial nonlinearities that remain robust to data scarcity and noise (Rao et al., 2021).
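A minimal sketch of a learnable power-law spring, assuming an energy of the form E(l) = (k/alpha) * |l - l0|**alpha (the specific exponent range used by Stern et al. is not reproduced here; k, l0, and alpha are the trainable quantities):

```python
import numpy as np

def spring_energy(l, k, l0, alpha):
    """Power-law spring energy E(l) = k/alpha * |l - l0|**alpha."""
    return k / alpha * np.abs(l - l0) ** alpha

def spring_force(l, k, l0, alpha):
    """Restoring force F = -dE/dl; vanishes at the rest length l0."""
    return -k * np.sign(l - l0) * np.abs(l - l0) ** (alpha - 1)

# alpha > 2 gives a stiffer-than-Hookean spring: soft near rest,
# sharply resisting large strains, which localizes deformation.
print(spring_energy(1.5, k=1.0, l0=1.0, alpha=4.0))  # = 0.5**4 / 4
```

Training such a network amounts to adjusting (k, l0, alpha) per spring so that desired configurations become local energy minima.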
3. Learning Schemes: Algorithmic Structures and Optimization
Model-Guided Neural Architectures
Machine-learning paradigms such as perturbation theory-aided digital back-propagation (PA-LDBP) for optical fiber nonlinearity compensation fuse analytical first-order perturbation models (derived from the nonlinear Schrödinger equation) with deep neural network architectures unfolded from split-step Fourier methods (Lin et al., 2021). Here, learned vectors of perturbation coefficients implement intra-channel cross-phase modulation as convolutional nonlinear layers within neural nets.
Perturbation-based nonlinearity compensation (PB-NLC) methods further demonstrate that learning linear or nonlinear combinations of analytic physics-derived “triplet” features yields optimal performance-complexity trade-offs in coherent optical communication (Luo et al., 2022, Luo et al., 2022).
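The triplet-feature structure of PB-NLC can be sketched as follows; the offsets, QPSK symbol alphabet, and synthetic distortion below are illustrative stand-ins, not the papers' fiber model. Each feature is a product A[k+m] * A[k+n] * conj(A[k+m+n]) of transmitted symbols, and a coefficient vector over these features is learned (here, by least squares) to reproduce the nonlinear distortion.

```python
import numpy as np

rng = np.random.default_rng(3)
# QPSK symbol sequence of unit power (toy transmitted signal).
A = (rng.choice([-1.0, 1.0], 256) + 1j * rng.choice([-1.0, 1.0], 256)) / np.sqrt(2)

# Triplet offsets (m, n) with m <= n to avoid duplicate features.
offsets = [(m, n) for m in range(-2, 3) for n in range(m, 3)]
k = np.arange(8, 248)  # symbol indices with valid neighbors

# Feature matrix: one physics-derived triplet per column.
F = np.stack([A[k + m] * A[k + n] * np.conj(A[k + m + n]) for m, n in offsets],
             axis=1)

# Synthetic nonlinear distortion generated by a hidden coefficient vector.
true_c = 0.01 * rng.normal(size=len(offsets))
d = F @ true_c

# "Learning" step: fit coefficients to the observed distortion.
c_hat, *_ = np.linalg.lstsq(F, d, rcond=None)
print(float(np.max(np.abs(F @ c_hat - d))))
```

The learned approaches in the cited works replace this linear fit with trained linear or nonlinear combinations of the same analytic features, trading complexity against compensation quality.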
Physics-Hard Encoding and Polynomial Nonlinearity
In PeRCNN, the central device for learned physical nonlinearity is the Π-block—an elementwise product of parallel convolutional feature maps. This allows the encoding (and learning) of polynomial nonlinearities up to an order set by the number of parallel branches, ensuring that all nonlinear modeling is interpretable as field interactions rather than as generic black-box activations (Rao et al., 2021).
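As a simplified concrete reading of this mechanism, the sketch below multiplies parallel convolutional feature maps elementwise. With an identity branch and a Laplacian branch it produces the second-order term u * laplacian(u); adding branches raises the polynomial order. The kernel choices are illustrative, not the trained kernels of PeRCNN.

```python
import numpy as np

def conv2d(u, k):
    """'Same' 2D correlation with zero padding (3x3 kernel)."""
    p = np.pad(u, 1)
    out = np.zeros_like(u)
    for i in range(u.shape[0]):
        for j in range(u.shape[1]):
            out[i, j] = np.sum(p[i:i + 3, j:j + 3] * k)
    return out

def pi_block(u, kernels):
    """Elementwise product of parallel conv feature maps -> polynomial term."""
    prod = np.ones_like(u)
    for k in kernels:
        prod *= conv2d(u, k)
    return prod

u = np.random.default_rng(1).random((16, 16))
identity = np.zeros((3, 3))
identity[1, 1] = 1.0
laplacian = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], float)

# Two branches -> a second-order term u * (laplacian u); more branches
# would yield higher-order polynomial interactions of the field.
term = pi_block(u, [identity, laplacian])
print(term.shape)  # (16, 16)
```

Because every output term is a product of known linear stencils applied to the field, the learned nonlinearity stays interpretable as a polynomial PDE right-hand side.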
Gradient and Local Update Rules
Physical learning networks, whether analog electrical or mechanical, apply local learning rules on device parameters (e.g. gate voltages, spring stiffness), driven by physical measurements and loss gradients extracted via frequency or contrastive signaling: e.g., frequency propagation in nonlinear resistive circuits utilizes two sinusoidal signals at different frequencies to read out activation and error signals, implementing gradient descent via local interactions (Anisetti et al., 2022). Overclamping and adaptive timing in analog resistor networks can mitigate bias-induced drift and precision limits (Dillavou et al., 28 May 2025).
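The flavor of such local rules can be illustrated with a toy in which every parameter update uses only two locally available signals, an activation and an error, rather than a global backward pass. This is a deliberate simplification of both frequency propagation and contrastive schemes; the linear map and target below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
W = rng.normal(scale=0.1, size=(3, 3))          # trainable device parameters
X = rng.normal(size=(100, 3))                   # input signals
T = X @ np.array([[1.0, 0, 0], [0, 2.0, 0], [0, 0, -1.0]])  # target responses

lr = 0.05
for _ in range(500):
    Y = X @ W
    E = Y - T                       # error signal, assumed locally readable
    W -= lr * (X.T @ E) / len(X)    # local product rule: dW_ij ~ -a_i * e_j

print(np.round(W, 2))
```

Each weight update depends only on the activation at its input node and the error at its output node, which is what makes such rules implementable by local physical measurements (e.g., the two-frequency readout of frequency propagation).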
4. Universality, Expressivity, and Physical Constraints
A foundational theorem establishes that universality in physical neural networks requires the multivariate nonlinear encoding function to be non-degenerate, i.e., its partial derivative with respect to each input dimension must not vanish identically (Savinson et al., 6 Sep 2025). This criterion governs physical substrate selection and architectural choices (e.g., scatterers, modulators, control stages) and applies across optical, electronic, and elastic domains alike.
Quantitative metrics such as “epsilon expressivity” (a task-agnostic packing measure of hardware primitives’ nonlinear curve diversity) inform hardware selection and network design in physical KANs (Taglietti et al., 20 Jan 2026). Empirical results show that learning nonlinear device characteristics leads to up to two orders of magnitude fewer parameters and devices than conventional linear weight-based networks.
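One hedged way to picture such a packing-style measure, an illustration of the idea rather than the paper's exact definition of epsilon expressivity, is a greedy count of mutually epsilon-separated transfer curves realizable by a device family (here, a hypothetical voltage-swept tanh family):

```python
import numpy as np

def epsilon_expressivity(curves, eps):
    """Greedy epsilon-packing count: how many curves in the family are
    pairwise separated by more than eps in the sup norm."""
    kept = []
    for c in curves:
        if all(np.max(np.abs(c - k)) > eps for k in kept):
            kept.append(c)
    return len(kept)

x = np.linspace(-1, 1, 64)
# Hypothetical family of transfer curves swept by a "control voltage" v.
curves = [np.tanh(v * x) for v in np.linspace(0.1, 5.0, 50)]

print(epsilon_expressivity(curves, eps=0.2))
```

A larger count at fixed epsilon indicates a richer set of distinguishable nonlinear primitives, which is the sense in which such a metric can rank hardware platforms in a task-agnostic way.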
5. Applications Across Domains
Communication Systems
Learned physical nonlinearity has achieved state-of-the-art nonlinear equalization and compensation in optical fiber links, allowing recovery of nonlinear SNR with flexible computational trade-offs. PA-LDBP (perturbation analysis-guided DNN) and fully learned PB-NLC architectures surpass digital back-propagation in Q² gain at lower computational cost, especially at high transmission rates and over multi-span links (Lin et al., 2021, Luo et al., 2022, Luo et al., 2022). Distributed equalization schemes leveraging learned nonlinearity also allow real-time adaptation to time-varying birefringence and PMD drift (Jain et al., 2022).
Physical and Analog Computation
Physical KANs trained on SYNE devices outperform software multilayer perceptrons (MLPs) for regression, classification, and temporal forecasting (e.g., Li-ion battery prognosis), at much lower energy and component counts (Taglietti et al., 20 Jan 2026). Ensemble optical learners with tunable LCPC scattering realize high-accuracy classification of MNIST/Fashion-MNIST/EMNIST at microwatt powers (Xia et al., 24 Jun 2025).
Physical System Modeling
Physics-hard-encoded neural structures reconstruct spatiotemporal nonlinear dynamics directly from scarce, noisy data, enabling accurate extrapolation and symbolic recovery of governing equations (e.g., Burgers, Gray-Scott reaction-diffusion) (Rao et al., 2021). Mechanical networks with non-linear elasticity sequentially acquire stable geometric states, preserving memory via sparse strain localization; results connect mechanical learning to sparse coding and regression (Stern et al., 2019). In plasma physics, symmetry-embedded learning approaches facilitate robust, frame-invariant closures for fluid-moment equations via data augmentation, outperforming analytic models and generalizing far beyond direct simulation (McGrae-Menge et al., 16 Jun 2025).
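Symmetry embedding via data augmentation can be sketched generically. The toy below uses an equivariant scalar-multiple mapping rather than the plasma closure model: each training pair is replicated under random rotations, so any model fit to the augmented data is pushed toward frame invariance.

```python
import numpy as np

def rotate(vec2, theta):
    """Rotate a 2-D vector by angle theta."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([c * vec2[0] - s * vec2[1], s * vec2[0] + c * vec2[1]])

def augment(X, Y, n_rot=8):
    """Replicate each 2-D (input, label) pair under n_rot random rotations,
    rotating input and label together to preserve equivariance."""
    rng = np.random.default_rng(0)
    Xa, Ya = [], []
    for x, y in zip(X, Y):
        for _ in range(n_rot):
            th = rng.uniform(0, 2 * np.pi)
            Xa.append(rotate(x, th))
            Ya.append(rotate(y, th))
    return np.array(Xa), np.array(Ya)

X = np.array([[1.0, 0.0], [0.0, 2.0]])
Y = 3.0 * X                       # toy equivariant "closure": y = 3x
Xa, Ya = augment(X, Y)
print(Xa.shape, Ya.shape)         # (16, 2) (16, 2)
```

Because the target relation commutes with rotations, every augmented pair still satisfies it exactly; a model trained on the augmented set therefore cannot exploit a preferred frame.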
6. Complexity Reduction, Scalability, and Practical Considerations
Learned physical nonlinearity enables scalable, energy-efficient, and compact implementations. Pruning of learned perturbation coefficients, k-means quantization, and cyclic buffer techniques ensure that learned NLC architectures in communication remain tractable for real-time application (Lin et al., 2021, Luo et al., 2022). Hardware pruning strategies (e.g., SYNE device count per synapse in KANs) match device deployment to function complexity with minimal performance loss (Taglietti et al., 20 Jan 2026).
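As an illustration of the quantization step (a generic 1-D k-means sketch, not the cited papers' exact pipeline), a learned coefficient vector can be compressed so that each coefficient is replaced by one of K representative levels:

```python
import numpy as np

def kmeans_1d(values, K, iters=50, seed=0):
    """Lloyd's algorithm on scalars: returns K centroids and per-value labels."""
    rng = np.random.default_rng(seed)
    centroids = rng.choice(values, K, replace=False)
    for _ in range(iters):
        labels = np.argmin(np.abs(values[:, None] - centroids[None, :]), axis=1)
        for k in range(K):
            if np.any(labels == k):          # keep centroid if cluster empties
                centroids[k] = values[labels == k].mean()
    return centroids, labels

# Stand-in for learned perturbation coefficients.
coeffs = np.random.default_rng(4).normal(size=1000)
centroids, labels = kmeans_1d(coeffs, K=8)
quantized = centroids[labels]                # each coefficient -> nearest level

mse_q = float(np.mean((coeffs - quantized) ** 2))
print(f"quantization MSE: {mse_q:.4f}")
```

After quantization, only the K centroid values and the per-coefficient labels need to be stored, which is what makes lookup-table implementations tractable in real time.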
Temporal and spatial multiplexing schemes (e.g., stacking in photonic ring-resonator networks) dramatically enhance effective system sizes even in poorly scalable hardware (Savinson et al., 6 Sep 2025). System-agnostic training modifications (overclamping, adaptive timing) suppress bias-induced drift without digital modeling or hardware redesign in analog networks (Dillavou et al., 28 May 2025).
7. Contemporary Challenges and Best Practices
Learned physical nonlinearity is subject to inherent hardware imperfections, deterministic bias, drift, and device-to-device heterogeneity. Robustness demands system-aware initialization, symmetric data augmentation, explicit physical parameter management, and physics-informed architectural guidelines.
Best practices include:
- Physics-aware digital twin training pipelines for hardware transferability (Taglietti et al., 20 Jan 2026).
- Embedding symmetries via data augmentation for physical consistency and generalization in modeling (McGrae-Menge et al., 16 Jun 2025).
- Hardware-aware metric development (“epsilon expressivity”) for platform selection.
- Pruning and quantization to tune complexity at both device and algorithmic levels.
Continued research aims to unify nonlinearity learning for increasingly diverse physical substrates—memristive, photonic, spintronic—and to establish learned physical nonlinearity as a central primitive for next-generation energy-efficient, compact, and scalable learning architectures.