
KAN-We-Flow: Adaptive Spline-Based Flow Models

Updated 8 February 2026
  • KAN-We-Flow is a class of models that replaces fixed neural activations with adaptive Kolmogorov–Arnold operators for universal approximation in flow-related tasks.
  • It integrates spline-based nonlinearity within architectures like GCNs, RWKV modules, and heterogeneous GNNs to enhance performance in traffic, robotics, and network applications.
  • The framework offers improved parameter efficiency, interpretability through symbolic surrogate extraction, and resilience under noisy or disrupted dynamics.

KAN-We-Flow refers to a class of models creating highly expressive, robust, and parameter-efficient predictors for diverse flow-related tasks by systematically replacing classical neural activation functions with Kolmogorov-Arnold Network (KAN) operators. Leveraging the constructive form of the Kolmogorov–Arnold representation theorem, KAN-We-Flow variants have been instantiated in spatiotemporal traffic optimization, 3D robotic policy design, electrohydrodynamic pump modeling, and symbolic graph surrogates. The unifying theme is embedding adaptive spline-based one-dimensional function approximators in lieu of fixed nonlinearity, resulting in strong universal approximation, interpretability, and improved resilience under noisy/disrupted dynamics (Peng et al., 2024, Zhang et al., 5 Mar 2025, Chen et al., 1 Feb 2026, Marouani et al., 24 Dec 2025).

1. Foundational Principles: Kolmogorov–Arnold Networks

All KAN-We-Flow models are mathematically grounded in the Kolmogorov–Arnold theorem, which states that any continuous function $f:[0,1]^n \to \mathbb{R}$ can be decomposed as

$$f(x_1, \dots, x_n) = \sum_{q=1}^{2n+1} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right),$$

with $\Phi_q$ and $\phi_{q,p}$ univariate continuous maps. In practice, a KAN layer of input dimension $n$ and output dimension $m$ stacks $m$ sums across $Q$ learned univariate spline-parameterized projections, optionally composed with linear transforms. This decomposition underpins the KAN block designs, offering theoretical universal approximation guarantees superior to those of fixed-nonlinearity MLPs at lower parameter cost (Peng et al., 2024, Marouani et al., 24 Dec 2025).
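As a concrete illustration, a minimal KAN layer can be sketched with piecewise-linear splines, where each input–output edge carries its own learnable univariate function. This is an assumption for simplicity: published KANs typically use B-splines plus a residual base activation.

```python
import numpy as np

def kan_layer(x, knots, coeffs):
    """Minimal KAN layer sketch: each (output, input) edge applies a
    learnable univariate piecewise-linear spline phi_{i,p}, then the
    results are summed over inputs.
    x: (n,) input; knots: (k,) shared grid; coeffs: (m, n, k) spline values."""
    m, n, k = coeffs.shape
    out = np.zeros(m)
    for i in range(m):
        for p in range(n):
            # np.interp evaluates the piecewise-linear spline phi_{i,p}
            out[i] += np.interp(x[p], knots, coeffs[i, p])
    return out
```

With splines initialized to the identity on one edge and zero elsewhere, the layer reduces to a simple feature selector, which makes the per-edge structure easy to inspect.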

2. Model Architectures and Integrations

2.1 Traffic Flow: GCN-KAN Hybrid (TrafficKAN-GCN)

"TrafficKAN-GCN"—here referred to as KAN-We-Flow—integrates a KAN activation block into each Graph Convolutional Network (GCN) layer. GCNs model urban transport as weighted graphs $G=(V,E,A)$, with edge weights $w_{ij}$ as functions of physical and historical quantities: length, speed limit, congestion, and travel time. Each graph convolution layer is of the form

$$H^{(l+1)} = \mathrm{KAN}\bigl(\tilde{A} H^{(l)} W^{(l)}\bigr),$$

where $\mathrm{KAN}$ replaces ReLU or other fixed nonlinearities. The KAN module consists of a stack of $m$ composed univariate splines, producing $g_i\!\left(\sum_j \phi_j\bigl([\tilde{A} H^{(l)} W^{(l)}]_{:j}\bigr)\right)$ for each output channel $i$. This approach captures nonseparable, highly nonlinear spatial dependencies in traffic (Zhang et al., 5 Mar 2025).
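A minimal sketch of such a layer, assuming symmetric adjacency normalization and, for simplicity, one shared piecewise-linear spline per output channel rather than the full edge-wise KAN block described above:

```python
import numpy as np

def gcn_kan_layer(H, A, W, knots, coeffs):
    """Sketch of one GCN layer with a KAN-style activation.
    H: (N, d) node features; A: (N, N) adjacency; W: (d, m) weights;
    knots: (k,) spline grid; coeffs: (m, k) per-channel spline values."""
    # Symmetric normalization: A_tilde = D^{-1/2} (A + I) D^{-1/2}
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    A_tilde = A_hat / np.sqrt(np.outer(d, d))
    Z = A_tilde @ H @ W  # linear message passing
    # Learned univariate spline per output channel replaces ReLU
    out = np.empty_like(Z)
    for j in range(Z.shape[1]):
        out[:, j] = np.interp(Z[:, j], knots, coeffs[j])
    return out
```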

2.2 3D Robotic Manipulation: RWKV-KAN Flow-Matching

For 3D robotic manipulation ("KAN-We-Flow" (Chen et al., 1 Feb 2026)), the UNet backbone is supplanted by a sequence of RWKV-KAN blocks. An RWKV module performs sequential time and channel mixing, followed by GroupKAN: each channel group is mapped through group-specific KAN layers applying learnable splines, both element-wise and group-wise. Channel Affinity Modulation (CAM) gates output groups via statistics on the feature-wise means. The output calibration is performed by composing these KAN layers with the time/channel-mixed latent sequences to produce action policies in a flow-matching conditional consistency framework.
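The CAM gating step can be illustrated with a toy sketch; the choice of a sigmoid gate over an affine map of each group's feature-wise mean is an assumption here, and the paper's exact parameterization may differ.

```python
import numpy as np

def cam_gate(x, n_groups, w, b):
    """Hypothetical Channel Affinity Modulation sketch: gate each channel
    group by a sigmoid of an affine map of its feature-wise mean.
    x: (..., C) features; w, b: (n_groups,) gate parameters."""
    c = x.shape[-1]
    gs = c // n_groups
    out = x.copy()
    for g in range(n_groups):
        sl = slice(g * gs, (g + 1) * gs)
        m = x[..., sl].mean()                     # group statistic
        gate = 1.0 / (1.0 + np.exp(-(w[g] * m + b[g])))  # sigmoid
        out[..., sl] *= gate
    return out
```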

2.3 Communication Network Delay: Heterogeneous GNNs with KAN (FlowKANet)

FlowKANet replaces all internal multilayer perceptrons in a heterogeneous GNN (message passing, attention, fusion, readout) with KAN blocks, yielding spline-based attention and transformation operators. The architecture uses bipartite flow–link graphs, with embeddings exchanged and updated via KAN-augmented message passing (KAMP-Attn). After training, these spline-based blocks can be symbolically regressed to closed-form algebraic surrogates retaining the original graph structure (Marouani et al., 24 Dec 2025).

3. Training Objectives, Losses, and Algorithms

3.1 Multi-task Losses

In all KAN-We-Flow variants, the core predictive objective is typically Mean Squared Error (MSE) on node-level or sample-level targets, i.e.,

$$\mathcal{L}_\mathrm{pred} = \frac{1}{N}\sum_{i=1}^{N} (y_i - \hat{y}_i)^2.$$

For GCN integrations, a smoothness regularizer over embeddings,

$$\mathcal{L}_\mathrm{graph} = \frac{1}{2}\sum_{i,j} A_{ij}\, \bigl\| H_i^{(L)} - H_j^{(L)} \bigr\|^2,$$

incentivizes local similarity.
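The smoothness regularizer above can be computed via the standard graph-Laplacian identity $\frac{1}{2}\sum_{i,j} A_{ij}\|H_i - H_j\|^2 = \mathrm{tr}(H^\top L H)$ with $L = D - A$ for symmetric $A$; a minimal sketch:

```python
import numpy as np

def smoothness_loss(H, A):
    """Graph smoothness regularizer (1/2) sum_ij A_ij ||H_i - H_j||^2,
    computed as trace(H^T L H) with unnormalized Laplacian L = D - A.
    H: (N, d) final-layer embeddings; A: (N, N) symmetric adjacency."""
    D = np.diag(A.sum(axis=1))
    L = D - A
    return np.trace(H.T @ L @ H)
```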

3.2 3D Policy Flow-Matching and Consistency

KAN-We-Flow for 3D flow-matching (Chen et al., 1 Feb 2026) adopts a conditional consistency flow-matching loss:

  • Learn $v_\theta(a_t, t, s, v) \approx \frac{d}{dt} a_t$ to transport Gaussian noise to expert actions.
  • Use endpoint and velocity consistency regularization with exponential moving average parameter targets.
  • Add Action Consistency Regularization (ACR), penalizing trajectory mismatch on a target horizon, for stabilization.
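The first step above can be sketched as a plain flow-matching objective, assuming a linear interpolation path $a_t = (1-t)z + t\,a$ so the target velocity is $a - z$; the consistency/EMA and ACR terms from the paper are omitted.

```python
import numpy as np

def flow_matching_loss(v_theta, actions, rng):
    """Minimal conditional flow-matching loss sketch.
    v_theta: callable (a_t, t) -> predicted velocity;
    actions: (N, d) expert actions; rng: numpy Generator."""
    z = rng.standard_normal(actions.shape)        # Gaussian source noise
    t = rng.uniform(size=(actions.shape[0], 1))   # random time per sample
    a_t = (1 - t) * z + t * actions               # point on the linear path
    target = actions - z                          # d a_t / d t along the path
    pred = v_theta(a_t, t)
    return np.mean((pred - target) ** 2)
```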

3.3 Routing and Flow Redistribution

For traffic, a flow redistribution algorithm post-prediction iteratively re-routes traffic away from congested/collapsed edges by

  • Removing any edge (i,j)(i,j) exceeding capacity thresholds.
  • Recomputing shortest paths under current network topology.
  • Pushing excess demand onto alternative paths, ensuring network-wide adaptation after disruptions such as bridge collapses (Zhang et al., 5 Mar 2025).
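The steps above can be sketched in miniature, assuming edge weights are travel times and that over-capacity edges are simply dropped before recomputing a shortest path with Dijkstra's algorithm; the paper's full algorithm also pushes excess demand onto the alternatives.

```python
import heapq

def reroute(edges, capacity, load, src, dst):
    """Redistribution sketch: drop over-capacity edges, then find the
    shortest surviving path with Dijkstra. edges/capacity/load map
    (u, v) -> travel time / capacity / predicted flow."""
    alive = {(u, v): w for (u, v), w in edges.items()
             if load.get((u, v), 0) <= capacity.get((u, v), float("inf"))}
    adj = {}
    for (u, v), w in alive.items():
        adj.setdefault(u, []).append((v, w))
    dist, prev = {src: 0.0}, {}
    pq = [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(pq, (nd, v))
    if dst not in dist:
        return None  # network disconnected after disruption
    path, node = [dst], dst
    while node != src:
        node = prev[node]
        path.append(node)
    return list(reversed(path))
```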

4. Empirical Performance and Benchmarking

4.1 Traffic Flow Prediction and Optimization

KAN-We-Flow matches MLP-GCN and Transformer baselines in accuracy, e.g., test MAE 3.61 (KAN-We-Flow) vs. 3.52 (MLP-GCN), but displays superior robustness under noise or disruption:

  • Under 20% input noise: RMSE up by 4.2% (KAN-We-Flow) vs. 7.8% (MLP-GCN) and 12.1% (GCN).
  • Bridge-collapse scenario (+10% edge removal): MSE jump +6.5% (KAN-We-Flow), compared to +11.2% (MLP-GCN) (Zhang et al., 5 Mar 2025).

4.2 3D Robotic Policy Success

KAN-We-Flow (33.6M parameters) achieves:

  • 100% success on Adroit-Hammer, 83% Door, 68% Pen.
  • Comparable or higher overall rates on Meta-World and DexArt compared to diffusion or flow-matching UNet (DP3) models, with 86.8% fewer parameters and 10×–14× lower inference latency (∼8–11 ms/control step).
  • Ablation: combining RWKV, GroupKAN, and ACR is necessary for optimal stability and accuracy (Chen et al., 1 Feb 2026).

4.3 Symbolic Surrogates and GNN Efficiency

FlowKANet attains near-baseline GNN MSE (40.81 vs. 38.64), with parameter counts reduced fivefold. Symbolic surrogates exhibit an expected tradeoff in accuracy (MSE 54.86) but enable instant evaluation and total algebraic transparency (Marouani et al., 24 Dec 2025).

5. Model Interpretability and Symbolic Extraction

A core benefit of KAN-We-Flow is the ability to extract human-readable, closed-form symbolic surrogates, either for regression tasks (e.g., EHD pump flow modeling) or for entire graph-structured pipelines (FlowKANet):

For example, in EHD pump modeling the flow rate formula is

$$Y_2 = 1.70 - 1.59 \tanh\Bigl(22.4\,(0.9-x_4)^4 - 3.33\,\sin(6.2\,x_3 - 2.35) + 0.08 - 2.11\,e^{-1.72\,x_5} + 2.13\,e^{-0.24\,x_2} - 0.89\,e^{-1.40\,x_1}\Bigr).$$

In FlowKANet, each block is regressed to algebraic expressions such as $0.12\,x_{f,1} - 0.47\,\log(x_{f,4}+1) + 0.35\,\tanh(x_{f,7})$, preserving the graph’s aggregation structure. This enables domain experts to dissect functional dependencies and assess trustworthiness (Peng et al., 2024, Marouani et al., 24 Dec 2025).
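The extraction idea can be illustrated in miniature: fit a learned univariate block onto a small library of symbolic primitives by least squares. Real symbolic regression searches a much richer space of expressions; the three-function basis here is an assumption for illustration only.

```python
import numpy as np

def fit_symbolic_surrogate(f, x_samples):
    """Toy symbolic-extraction sketch: least-squares fit of a learned
    1-D function f onto the candidate basis {x, log(x+1), tanh(x)}.
    Returns one coefficient per primitive."""
    X = np.column_stack([x_samples,
                         np.log(x_samples + 1),
                         np.tanh(x_samples)])
    y = f(x_samples)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef
```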

6. Computational Considerations and Scalability

Replacing fixed nonlinearities with KAN inflates per-layer parameter counts by $O(Nms)$ for a graph with $N$ nodes, $m$ output channels, and spline order $s$. TrafficKAN-GCN observed a ∼70% increase in training time over standard GCNs, but inference remains real-time (<50 ms per 3400 edges). Compression via pruning, knowledge distillation, or function table surrogates is proposed for scaling to $10^5$-edge networks. 3D robotics policies run at 100 Hz on commodity GPUs with one-seventh the typical diffusion model parameter count (Chen et al., 1 Feb 2026, Zhang et al., 5 Mar 2025).

7. Impact, Generalization, and Future Prospects

KAN-We-Flow advances parameter efficiency, interpretability, and robustness across disparate flow prediction and optimization domains. Its flexible integration with both spatial (GCN, GNN) and temporal (RWKV, flow-matching) backbones, combined with symbolic distillation, opens pathways for transparent, certifiable surrogate modeling as well as low-latency intelligent control under non-stationary or catastrophic scenarios. Anticipated extensions include GCN–Transformer hybrids, continual online calibration, and deployment of symbolic surrogates in model-predictive or real-time feedback settings (Zhang et al., 5 Mar 2025, Marouani et al., 24 Dec 2025).

