
Fourier-Kolmogorov-Arnold Networks

Updated 17 January 2026
  • Fourier-KANs are neural architectures that combine the Kolmogorov-Arnold representation theorem with truncated Fourier series, enabling adaptive spectral expressivity.
  • They employ innovative techniques such as matrix association and learnable random Fourier features to reduce parameter complexity while capturing high-frequency signal components.
  • Their versatile applications in vision, language, audio, and scientific modeling demonstrate empirical gains in accuracy, efficiency, and interpretability.

Fourier Kolmogorov–Arnold Networks (Fourier-KAN) are a class of neural architectures that integrate the Kolmogorov–Arnold representation theorem with Fourier series or Random Fourier Features (RFF) to achieve adaptive spectral expressivity, parameter efficiency, and interpretability. They were developed to overcome key limitations of vanilla Kolmogorov–Arnold Networks (KAN), such as parameter explosion and the inability to capture high-frequency features in high-dimensional learning tasks. Fourier-KANs have been applied to vision, language, audio signal representation, time-series anomaly detection, implicit neural representations, graph learning, scientific machine learning, and differentiable operator learning.

1. Mathematical Foundations and Theoretical Guarantees

Fourier-KANs merge the Kolmogorov–Arnold superposition principle, which states that any continuous multivariate function $f:\mathbb{R}^d\to\mathbb{R}$ can be written as sums and compositions of univariate functions, with the Fourier series theorem, which provides spectral bases for function approximation. A canonical Fourier-KAN layer replaces standard piecewise or spline basis functions by truncated Fourier expansions: $f(x) = \sum_{i=1}^{d} \sum_{k=1}^{K} \left(a_{i,k} \cos(k x_i) + b_{i,k} \sin(k x_i)\right)$, where $K$ is the spectral order and $\{a_{i,k}, b_{i,k}\}$ are learnable parameters per input dimension (Xu et al., 2024, Li et al., 2024). For high-dimensional applications, Fourier-KAN blocks feature further optimizations such as learnable RFF with bandwidth and phase initialization following kernel-theoretic principles (e.g., $\omega_{ij} \sim \mathcal{N}(0, \sigma^2/d)$ and $b_i \sim \text{Uniform}[0,2\pi]$ (Zhang et al., 9 Feb 2025)).
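The truncated expansion above can be evaluated directly. The following is a minimal sketch in plain Python, not code from any of the cited papers: the function name `fourier_kan_layer` and the coefficient values are illustrative assumptions, and a single scalar output is computed for clarity.

```python
import math

def fourier_kan_layer(x, a, b):
    """Evaluate f(x) = sum_i sum_k a[i][k-1]*cos(k*x_i) + b[i][k-1]*sin(k*x_i).

    x: list of d inputs; a, b: d x K coefficient lists (hypothetical values below).
    """
    total = 0.0
    for i, xi in enumerate(x):
        for k in range(1, len(a[i]) + 1):
            total += a[i][k - 1] * math.cos(k * xi) + b[i][k - 1] * math.sin(k * xi)
    return total

# Illustrative coefficients for d = 2 inputs and spectral order K = 2.
a = [[1.0, 0.0], [0.0, 0.5]]
b = [[0.0, 0.0], [2.0, 0.0]]
x = [0.0, math.pi / 2]

y = fourier_kan_layer(x, a, b)
# Contributions: cos(0) = 1 from the first input; 2*sin(pi/2) + 0.5*cos(pi) = 1.5 from the second.
print(y)
```

In this form each output unit carries $2 \cdot d \cdot K$ learnable coefficients, which is the count the efficiency techniques in the next section aim to reduce.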

The universality of Fourier-KANs is established for $L^2([0,2\pi]^n)$, matching deep spline-KANs; smooth multivariate functions can be approximated arbitrarily well by a finite-depth network with Fourier-edge expansions (Li et al., 2024). Polynomial bounds on the parameter complexity are demonstrated for dual-domain Fourier-Kolmogorov–Arnold neural operators (KANO), showing a strict advantage over pure spectral models for dense or position-dependent operators (Lee et al., 20 Sep 2025).

2. Architectural Innovations and Parameter Efficiency

Fourier-KAN architectures address the parameter explosion in standard KANs via matrix association, low-rank spectral projections, and hybrid activation schemes. In Kolmogorov-Arnold-Fourier Networks (KAF) (Zhang et al., 9 Feb 2025):

  • The dual-matrix structure of classic KAN layers ($W_A \cdot \phi(x) + W_B \cdot \psi(x)$) is collapsed via matrix association into $(W_A + W_B) \cdot (a \cdot \phi(x) + b \cdot \psi(x))$ with element-wise learnable scaling. This reduces the per-layer parameter cost from $d_{in} \times d_{out} \times (G+K+3) + d_{out}$ to $d_{in} \times d_{out} + d_{out}$.
  • Trainable RFF layers enable adaptive spectral embeddings, with analytically differentiable parameters (gradients for $\omega$, $b$ are computed via chain rule).
  • Adaptive hybrid activations $H(x) = a \odot \text{GELU}(x) + b \odot \text{RFF}(x)$ begin training with low-frequency bias (small $b$) and shift toward high-frequency coverage as $b$ grows, modulating the response spectrum during optimization.
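To make the parameter arithmetic and the hybrid activation concrete, here is a rough sketch in plain Python. It is not the KAF reference implementation. Several choices are assumptions made for illustration: the helper names, the $\sqrt{2/m}$ feature scaling (a common RFF convention not stated above), taking the RFF width equal to the input width so the element-wise products are defined, and the example values of $G$ and $K$ (whose meaning is not defined in this text).

```python
import math
import random

def kan_params(d_in, d_out, G, K):
    # Per-layer cost quoted above for a classic KAN layer.
    return d_in * d_out * (G + K + 3) + d_out

def kaf_params(d_in, d_out):
    # Per-layer cost quoted above after matrix association.
    return d_in * d_out + d_out

def gelu(v):
    # Exact GELU: v * Phi(v), with Phi the standard normal CDF.
    return 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0)))

def init_rff(d, m, sigma, rng):
    # omega_ij ~ N(0, sigma^2 / d), phase_i ~ Uniform[0, 2*pi], as in the text.
    omega = [[rng.gauss(0.0, sigma / math.sqrt(d)) for _ in range(d)] for _ in range(m)]
    phase = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(m)]
    return omega, phase

def rff(x, omega, phase):
    # Assumed scaling sqrt(2/m); each feature is cos(<omega_i, x> + phase_i).
    scale = math.sqrt(2.0 / len(omega))
    return [scale * math.cos(sum(w * xi for w, xi in zip(row, x)) + p)
            for row, p in zip(omega, phase)]

def hybrid(x, a_scale, b_scale, omega, phase):
    # H(x) = a * GELU(x) + b * RFF(x), element-wise; assumes len(omega) == len(x).
    feats = rff(x, omega, phase)
    return [a * gelu(xi) + b * f for a, xi, b, f in zip(a_scale, x, b_scale, feats)]

# Illustrative sizes: 64 -> 64 layer with G = 5, K = 3.
print(kan_params(64, 64, 5, 3), kaf_params(64, 64))

rng = random.Random(0)
omega, phase = init_rff(d=2, m=2, sigma=1.0, rng=rng)
# With b_scale = 0 the activation reduces to a scaled GELU (the low-frequency start).
print(hybrid([0.0, 1.0], [1.0, 1.0], [0.0, 0.0], omega, phase))
```

Setting `b_scale` to zero reproduces the low-frequency starting point described in the last bullet; increasing it mixes in the oscillatory RFF term.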

Projective-KANs (P-KANs) further compress KANs by entropy-driven projection to Fourier bases, using sparsity-inducing penalties and gravitational regularization to encourage edge functions to converge to low-dimensional Fourier expansions (Poole et al., 24 Sep 2025). Empirically, this achieves up to 80% parameter reduction per edge while maintaining representational capacity, with stable training under noise.

3. Spectral Adaptivity and Expressivity

Fourier-KANs directly learn the activation frequency content necessary for each task. Unlike fixed positional encodings or global periodic activation networks, Fourier-KANs use learnable or data-adaptive spectral representations:

  • In KAF, learnable RFF frequencies and phases allow the network to capture the exact bandwidth needed for the signal (Zhang et al., 9 Feb 2025).
  • In Implicit Neural Representation contexts, first-layer activations have fully adaptive Fourier coefficients, serving as a spectral filter bank that can allocate power to arbitrary bands based on target reconstruction error (Mehrabian et al., 2024).
  • For audio and time series, frequency-adaptive learning strategies such as an inverted-pyramid mode assignment (layerwise decreasing spectral capacity) and frequency-aware weight initialization ensure rapid convergence across all frequency bands and mitigate spectral bias (Li et al., 10 Jan 2026, Zhou et al., 2024). The cited works report robustness to both high-frequency and low-frequency content with little hyperparameter sensitivity.

Global support of Fourier bases enhances expressivity for smooth and periodic functions; however, Gibbs phenomena can affect approximation near discontinuities, which is addressed via hybridization or alternative local basis functions (Noorizadegan et al., 28 Oct 2025).
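The Gibbs effect mentioned above can be seen on the textbook example of a unit square wave, whose Fourier series is $\frac{4}{\pi}\sum_{j \ge 1} \frac{\sin((2j-1)x)}{2j-1}$. The sketch below is a generic illustration, not taken from the cited work. It evaluates the partial sum at its first maximum next to the jump, $x = \pi/(2m)$, for increasing numbers of terms $m$.

```python
import math

def square_wave_partial_sum(x, m):
    """First m odd harmonics of the unit square wave (values +1 / -1)."""
    return (4.0 / math.pi) * sum(
        math.sin((2 * j - 1) * x) / (2 * j - 1) for j in range(1, m + 1)
    )

def first_peak(m):
    # The derivative of the partial sum is proportional to sin(2*m*x)/sin(x),
    # so the first maximum to the right of the jump at x = 0 sits at x = pi/(2m).
    return square_wave_partial_sum(math.pi / (2 * m), m)

peaks = {m: first_peak(m) for m in (10, 100, 1000)}
for m, p in peaks.items():
    print(m, round(p, 4))
```

The peak stays near 1.18 rather than approaching the true value 1 as $m$ grows. Adding harmonics narrows the overshoot but does not remove it, which is why hybrid or local bases are considered near discontinuities.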

4. Applications Across Domains

Fourier-KANs have been applied in a wide array of domains, consistently achieving empirical gains over standard MLPs, spline-based KANs, and pure spectral models:

  • Computer Vision: KAF outperforms MLP, KAN, and kernel-based alternatives on MNIST, CIFAR, and ImageNet under tight parameter budgets (e.g., CIFAR-10 at $1.5 \times 10^5$ params, KAF $91.8\%$ vs. MLP $91.2\%$, FAN $90.7\%$) (Zhang et al., 9 Feb 2025).
  • Language and Audio: For NLP tasks (CoLA, AG_NEWS) and audio processing (SpeechCommand, UrbanSound8K), Fourier-KANs deliver 1–3 point improvements in accuracy and faster convergence (Zhang et al., 9 Feb 2025). In audio, they offer comparable SNR to carefully tuned positional encoding MLPs but without hyperparameter sensitivity (Li et al., 10 Jan 2026).
  • Implicit Representations and Signal Modeling: In INR for images and 3D shapes, learnable Fourier-layer activations in FKAN improve PSNR/SSIM and IoU over state-of-the-art baselines, converging to fine textures and boundaries more rapidly (Mehrabian et al., 2024).
  • Graph and Molecular Learning: Fourier-KAN modules improve representation power and trainability in collaborative filtering (FourierKAN-GCF: Recall@20 improvement from 0.3307 to 0.3564 on MOOC (Xu et al., 2024)) and molecular property prediction (KA-GNN: ROC-AUC, e.g., BACE $0.890$ vs. $0.873$ SMPT (Li et al., 2024)).
  • Operator Learning and Scientific Modeling: Dual-domain KANO architectures remain expressive over generic position-dependent PDE operators, outperforming Fourier Neural Operator (FNO) approaches and reconstructing symbolic Hamiltonians in quantum mechanics to four-decimal precision (Lee et al., 20 Sep 2025).
  • Time Series Anomaly Detection: KAN-AD and Fourier-KAN-Mamba exploit global Fourier bases for robust, lightweight, and fast anomaly detection, with parameter counts $<1000$ and $15\%$+ Event F1 improvements over prior art (Zhou et al., 2024, Wang et al., 19 Nov 2025).

5. Training Procedures, Regularization, and Best Practices

Fourier-KANs employ standard optimization protocols—typically Adam with layer-wise learning rate schedules and early stopping. Regularization schemes span:

Ablation studies across works show that both hybrid activations (e.g., GELU+RFF) and theoretically-guided spectral initializations are essential for high accuracy and generalization. In practice, Fourier-KANs demonstrate negligible hyperparameter sensitivity compared to coordinate-based positional encoding MLPs.
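As a toy illustration of the optimization protocol, the sketch below fits the coefficients of a one-dimensional truncated Fourier expansion with a hand-written Adam update and patience-based early stopping on a held-out grid. Everything here is an assumption made for the example: the target signal, the grid sizes, the hyperparameters, and the helper names. It uses a single learning rate and does not implement the layer-wise schedules mentioned above.

```python
import math

def design(xs, K):
    # One row per sample: [cos(x), sin(x), cos(2x), sin(2x), ..., cos(Kx), sin(Kx)].
    return [[f(k * x) for k in range(1, K + 1) for f in (math.cos, math.sin)] for x in xs]

def mse(theta, rows, ys):
    return sum((sum(t * r for t, r in zip(theta, row)) - y) ** 2
               for row, y in zip(rows, ys)) / len(ys)

def grad(theta, rows, ys):
    g = [0.0] * len(theta)
    for row, y in zip(rows, ys):
        err = sum(t * r for t, r in zip(theta, row)) - y
        for j, r in enumerate(row):
            g[j] += 2.0 * err * r / len(ys)
    return g

def fit(train, val, K, lr=0.05, beta1=0.9, beta2=0.999, eps=1e-8,
        max_steps=500, patience=50):
    (xt, yt), (xv, yv) = train, val
    rows_t, rows_v = design(xt, K), design(xv, K)
    theta = [0.0] * (2 * K)
    m, v = [0.0] * (2 * K), [0.0] * (2 * K)
    best_val, best_theta, stale = mse(theta, rows_v, yv), list(theta), 0
    for step in range(1, max_steps + 1):
        g = grad(theta, rows_t, yt)
        for j in range(len(theta)):
            # Standard Adam moment estimates with bias correction.
            m[j] = beta1 * m[j] + (1 - beta1) * g[j]
            v[j] = beta2 * v[j] + (1 - beta2) * g[j] ** 2
            m_hat = m[j] / (1 - beta1 ** step)
            v_hat = v[j] / (1 - beta2 ** step)
            theta[j] -= lr * m_hat / (math.sqrt(v_hat) + eps)
        val_loss = mse(theta, rows_v, yv)
        if val_loss < best_val:
            best_val, best_theta, stale = val_loss, list(theta), 0
        else:
            stale += 1
            if stale >= patience:  # early stopping: no validation gain for `patience` steps
                break
    return best_theta, best_val

def target(x):
    # Hypothetical signal: sin(2x) + 0.5*cos(3x).
    return math.sin(2 * x) + 0.5 * math.cos(3 * x)

n = 64
x_train = [2 * math.pi * i / n for i in range(n)]
x_val = [2 * math.pi * (i + 0.5) / n for i in range(n)]
train = (x_train, [target(x) for x in x_train])
val = (x_val, [target(x) for x in x_val])

init_val = mse([0.0] * 8, design(x_val, 4), val[1])
best_theta, best_val = fit(train, val, K=4)
print(init_val, best_val)
```

The parameters returned are those with the lowest validation loss seen during training, which is the usual way early stopping is combined with a fixed step budget.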

6. Interpretability, Limitations, and Future Directions

Fourier-KANs naturally enable interpretable functional discovery. Learned edge functions are directly analyzable—the magnitude of Fourier coefficients identifies dominant frequencies. In industrial applications (e.g., automated fiber placement), the model automatically discovered that low-frequency modes explained over 95% of behavior (Poole et al., 24 Sep 2025). Symbolic freezing in KANO architectures recovers exact PDE operator coefficients to high precision with minimal parameterization (Lee et al., 20 Sep 2025).
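Reading dominant frequencies off a learned edge function amounts to comparing the per-frequency energies $a_k^2 + b_k^2$. The sketch below uses made-up coefficients for a single edge; the values and the helper name `frequency_energy` are illustrative assumptions, not results from the cited studies.

```python
def frequency_energy(a, b):
    """Normalized energy share of each frequency k = 1..K for one edge function."""
    energy = [ak * ak + bk * bk for ak, bk in zip(a, b)]
    total = sum(energy)
    return [e / total for e in energy]

# Hypothetical learned coefficients for K = 4.
a = [0.9, 0.1, 0.0, 0.05]
b = [0.3, 0.0, 0.1, 0.0]

share = frequency_energy(a, b)
dominant_k = max(range(len(share)), key=share.__getitem__) + 1
low_freq_share = sum(share[:2])  # energy carried by k = 1 and k = 2

print(dominant_k, round(low_freq_share, 4))
```

A statement such as "low-frequency modes explain over 95% of behavior" corresponds to `low_freq_share` exceeding 0.95 for the edges of interest.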

Limitations remain in scaling Fourier-KANs to highly discontinuous functions (Gibbs effect), computational overhead for entropy-driven projection, and efficient handling of massive graphs or high-order spectral expansions. Ongoing research targets hybrid basis integration, sparse and multi-resolution spectral schemes, inverse problem settings, and domain-specific Fourier-KAN compositions (Noorizadegan et al., 28 Oct 2025, Mehrabian et al., 2024).

7. Representative Empirical Results

| Model | Domain | Param Count | Main Metric | Best Baseline | Fourier-KAN Value | Δ (improvement) |
|---|---|---|---|---|---|---|
| KAF (CIFAR-10) | Vision | $1.5\times10^5$ | Accuracy | MLP (91.2%) | 91.8% | +0.6 pts |
| FKAN (Kodak) | INR (Images) | -- | PSNR, SSIM | INCODE (34.81, 0.889) | 37.91, 0.939 | +8.91% PSNR (relative) |
| FourierKAN-GCF | Recommender | -- | Recall@20 | LightGCN (0.3307) | 0.3564 | +7.7% (relative) |
| KAN-AD (UCR) | Time Series | 274 | Event F1 | KAN (0.4120) | 0.5335 | +0.1215 (12.15 pts) |
| KA-GNN (BACE) | Molecular | 44,000 | ROC-AUC | SMPT (0.873) | 0.890 | +0.017 |
| Fourier-KAN-Mamba | TS Anomaly | -- | F1 (MSL) | Linear (90.99%) | 93.27% | +2.28 pts |

These results confirm strong empirical performance in both standard regression/classification tasks and scientific applications (Zhang et al., 9 Feb 2025, Mehrabian et al., 2024, Zhou et al., 2024, Li et al., 2024, Wang et al., 19 Nov 2025).
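The Δ column of the table mixes absolute differences and relative gains. As a quick check, the snippet below recomputes both from the baseline and Fourier-KAN values listed in the table; the helper names are illustrative.

```python
def absolute_gain(baseline, value):
    return value - baseline

def relative_gain_pct(baseline, value):
    return 100.0 * (value - baseline) / baseline

# (baseline, Fourier-KAN value) pairs copied from the table.
results = {
    "KAF accuracy (%)": (91.2, 91.8),
    "FKAN PSNR": (34.81, 37.91),
    "FourierKAN-GCF Recall@20": (0.3307, 0.3564),
    "KAN-AD Event F1": (0.4120, 0.5335),
    "KA-GNN ROC-AUC": (0.873, 0.890),
    "Fourier-KAN-Mamba F1 (%)": (90.99, 93.27),
}

for name, (base, val) in results.items():
    print(f"{name}: abs {absolute_gain(base, val):+.4f}, rel {relative_gain_pct(base, val):+.2f}%")
```

The FKAN and FourierKAN-GCF entries match the relative gains, while the KAF, KAN-AD, KA-GNN, and Fourier-KAN-Mamba entries match the absolute differences.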


Fourier-Kolmogorov-Arnold Networks constitute a general family of neural architectures that combine the theoretical foundation of functional superpositions with adaptive spectral representation, providing a widely validated framework for efficient, interpretable, and spectrally expressive learning in high-dimensional, complex signal domains.
