
Frequency-Adaptive Non-uniform Compression (FANC)

Updated 7 February 2026
  • FANC is a technique that adaptively decomposes data using spectral transforms (e.g., DCT, wavelet) to achieve non-uniform, content-adaptive bit allocation.
  • It improves rate-distortion trade-offs by selectively encoding frequency, spatial, or temporal features, preserving critical details in high-dimensional data.
  • Empirical studies demonstrate FANC delivers substantial computational, memory, and performance gains in applications like image compression, speech enhancement, and LLM cache optimization.

Frequency-Adaptive Non-uniform Compression (FANC) encompasses a set of algorithmic strategies and neural network modules that adaptively allocate coding resources across frequency domains, spatial regions, or temporal positions within high-dimensional data. FANC leverages explicit or learned decompositions—typically in the Fourier, DCT, or wavelet bases—and provides non-uniform, content-adaptive bit allocation that maximizes efficiency for sources exhibiting spectral or spatial inhomogeneity. By exploiting both the statistical and perceptual variability across frequency components, FANC achieves substantial improvements in rate-distortion trade-offs, architectural efficiency, and computational/memory savings across diverse domains including image compression, speech enhancement, LLM KV-cache compression, and computational wave physics.

1. Core Principles and Theoretical Foundations

FANC is unified by three primary algorithmic principles:

  1. Explicit Frequency Decomposition: FANC systems use spectral transforms (DCT, DWT, FFT) or learned proxies (e.g., frequency masks from error-variance maps) to decompose data into interpretable frequency bands or subbands (Pan et al., 25 Nov 2025, Choi et al., 2023, Rhee et al., 2021).
  2. Adaptive Bit Allocation: Coding or modeling resources are allocated non-uniformly across frequency components, spatial regions, or time, driven by learnable weights, ablation-derived masks, or statistical heuristics. Content-adaptive strategies are central, enabling robust handling of local signal complexity and information density (Pan et al., 25 Nov 2025, Li et al., 26 Jul 2025, Xue et al., 31 Jan 2026).
  3. Band- or Mask-driven Encoding/Decoding: Bits are spent on frequencies/regions according to importance, as inferred from learned error maps, DCT-magnitude statistics, message-passing marginals (LDGM codes), or information-theoretic ablation studies (Cappellari, 2010, Li et al., 26 Jul 2025, Rhee et al., 2021).

These principles enable FANC frameworks to efficiently represent both low- and high-frequency content, realizing performance unattainable by uniform, non-adaptive coding.
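As a minimal illustration of the first two principles, the sketch below decomposes a toy signal with an orthonormal DCT and assigns more bits to higher-variance bands. The allocation rule used here is the classical high-resolution transform-coding rule, chosen as an assumption for illustration; it is not taken from any of the cited papers, and `dct_matrix` and `allocate_bits` are hypothetical helper names.

```python
import numpy as np

def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    mat = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    mat[0, :] /= np.sqrt(2.0)  # scale the DC row so the matrix is orthonormal
    return mat

def allocate_bits(variances: np.ndarray, avg_bits: float) -> np.ndarray:
    """Assumed rule: b_k = avg_bits + 0.5 * log2(var_k / geometric mean).

    Negative allocations are clipped to zero, so the mean can exceed
    avg_bits when clipping occurs.
    """
    log_var = np.log2(np.maximum(variances, 1e-12))
    bits = avg_bits + 0.5 * (log_var - log_var.mean())
    return np.maximum(bits, 0.0)

rng = np.random.default_rng(0)
n = 8
# Toy source: random walks, whose energy concentrates in low frequencies.
signals = np.cumsum(rng.standard_normal((1000, n)), axis=1)
D = dct_matrix(n)
coeffs = signals @ D.T            # one row of DCT coefficients per signal
band_var = coeffs.var(axis=0)     # per-band variance across the dataset
bits = allocate_bits(band_var, avg_bits=3.0)
print(np.round(bits, 2))          # low-frequency bands receive more bits
```

A uniform coder would give every band `avg_bits`; the non-uniform allocation shifts bits toward the bands that carry most of the signal energy.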

2. Algorithmic Realizations Across Domains

a) Learned Image Compression

In neural image codecs, FANC is instantiated by modules such as the Adaptive Frequency Decomposition (AFD) (Rhee et al., 2021), which predicts an error-variance map $\sigma(x)$ per pixel and channel:

$$(\hat{x},\;\sigma) = \mathrm{AFD}(x,z)$$

$$m_L(i) = \begin{cases} 1, & \sigma(i)\le\tau \\ 0, & \sigma(i)>\tau \end{cases}$$

where $m_L$ and $m_H = 1 - m_L$ define the low- and high-frequency masks, and $\tau$ is an image- and channel-adaptive threshold. Compression proceeds in a coarse-to-fine pipeline, first encoding low-frequency pixels with a dedicated network, then encoding high-frequency pixels conditioned on the decoded low-frequency content (Rhee et al., 2021).
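The thresholding step can be sketched in a few lines. This is only an illustration of the mask definition: the error-variance map is a hand-written toy array rather than the output of an AFD network, and `frequency_masks` is a hypothetical helper name.

```python
import numpy as np

def frequency_masks(sigma: np.ndarray, tau: float):
    """Split positions into low/high-frequency sets by thresholding sigma."""
    m_low = (sigma <= tau).astype(np.uint8)   # 1 where predicted error variance is small
    m_high = 1 - m_low                        # complement: hard-to-predict positions
    return m_low, m_high

# Toy error-variance map and pixel values (assumed, for illustration only).
sigma = np.array([[0.1, 0.9],
                  [0.4, 0.5]])
x = np.array([[10, 20],
              [30, 40]])

m_low, m_high = frequency_masks(sigma, tau=0.45)
x_low = x[m_low == 1]    # would be coded first in the coarse-to-fine pipeline
x_high = x[m_high == 1]  # would be coded second, conditioned on decoded x_low
print(m_low.tolist(), x_low.tolist(), x_high.tolist())
```

Because the two masks are complementary, every pixel is coded exactly once, in one of the two passes.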

Advanced architectures such as HCFSSNet further embed FANC by local DCT decomposition and Adaptive Frequency Modulation Modules (AFMM), learning per-frequency weights $W_f(u,v)$ via CNNs and modulating latent representations before entropy coding. This facilitates direct learning of optimal bit allocation schemes:

$$\widetilde{X}(u,v) = W_f(u,v) \cdot X(u,v)$$

Pan et al. (25 Nov 2025) demonstrate bidirectional fusion of spatial (state-space/VONSS) and frequency (AFMM/DCT) cues, which improves both long-range and fine-detail compression.
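The per-frequency modulation $\widetilde{X}(u,v) = W_f(u,v) \cdot X(u,v)$ amounts to an elementwise product in the DCT domain. The sketch below applies it to a single 8×8 block. The weights are a fixed, hand-chosen decay that stands in for the CNN-predicted weights of AFMM, so this shows only the arithmetic, not the learned module; `dct_matrix` and `modulate_block` are hypothetical names.

```python
import numpy as np

def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    mat = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    mat[0, :] /= np.sqrt(2.0)
    return mat

def modulate_block(block: np.ndarray, weights: np.ndarray):
    """Return (X, X_mod): the 2D DCT of the block and its weighted version."""
    D = dct_matrix(block.shape[0])
    X = D @ block @ D.T        # separable 2D DCT of a square block
    return X, weights * X      # elementwise per-frequency modulation

n = 8
u, v = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
# Assumed stand-in weights: keep DC untouched, attenuate high frequencies.
weights = 1.0 / (1.0 + 0.25 * (u + v))

block = np.random.default_rng(0).standard_normal((n, n))
X, X_mod = modulate_block(block, weights)
```

Down-weighting a coefficient before quantization and entropy coding shrinks its magnitude and therefore the bits it tends to consume, which is how a weight map can act as a bit-allocation scheme.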

b) Speech and Audio Spectrograms

FANC encoders in speech enhancement, as in DVPD (Xue et al., 31 Jan 2026), partition the spectrogram along frequency bands, applying non-uniform compression:

  • Low ($[0, F_\ell]$, e.g., 0–2 kHz): no downsampling, preserving harmonics.
  • Mid ($F_\ell < f \leq F_m$, e.g., 2–4 kHz): moderate compression (stride=2, medium dilation).
  • High ($F_m < f \leq F$, e.g., >4 kHz): aggressive compression (stride=4, large dilation).

The mapping is:

$$C(f,t) = B_\ell(f)\cdot \mathrm{Conv}_{3,1,1}(S) + B_m(f)\cdot \mathrm{Conv}_{3,3,2}(S) + B_h(f)\cdot \mathrm{Conv}_{3,5,4}(S)$$

Such band-splitting aligns with psychoacoustic relevance and empirical sparsity, yielding both efficiency and preservation of perceptual quality (Xue et al., 31 Jan 2026).
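The effect of the tri-band partition on the frequency axis can be sketched with plain array operations. In the sketch below, average pooling with strides 1, 2, and 4 stands in for the strided, dilated convolutions of the mapping above, so it reproduces only the non-uniform resolution, not the learned filters. The band edges (bins 64 and 128 of 256, i.e. 2 kHz and 4 kHz if the bins span 0–8 kHz) and the name `band_compress` are assumptions.

```python
import numpy as np

def band_compress(spec: np.ndarray, f_low: int, f_mid: int) -> np.ndarray:
    """Compress a (freq_bins, frames) spectrogram non-uniformly along frequency.

    Bins [0, f_low) are kept, [f_low, f_mid) are pooled by 2, and
    [f_mid, end) are pooled by 4. Trailing bins that do not fill a
    complete pooling group are dropped.
    """
    def pool(band: np.ndarray, stride: int) -> np.ndarray:
        usable = (band.shape[0] // stride) * stride
        grouped = band[:usable].reshape(-1, stride, band.shape[1])
        return grouped.mean(axis=1)

    low = spec[:f_low]                 # full resolution preserves harmonics
    mid = pool(spec[f_low:f_mid], 2)   # moderate compression
    high = pool(spec[f_mid:], 4)       # aggressive compression
    return np.concatenate([low, mid, high], axis=0)

spec = np.abs(np.random.default_rng(0).standard_normal((256, 100)))
out = band_compress(spec, f_low=64, f_mid=128)
print(spec.shape, "->", out.shape)     # 64 + 32 + 32 = 128 frequency rows
```

With these band edges the frequency axis shrinks by half overall, while the lowest quarter of the spectrum passes through unchanged.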

c) LLM KV Cache Compression

In KV cache applications, FAEDKV (Li et al., 26 Jul 2025) employs an Infinite-Window Discrete Fourier Transform (IWDFT) to transform the cache into the frequency domain. Layer-specific spectral band selection is performed via ablation studies:

$$K^{f,t+1,\ell}[k]=W_k\left(\frac{N-1}{N}K^{f,t,\ell}[k]+\frac{1}{N}\,k_{t-R}^\ell\right)$$

Masking is applied to prune uninformative bins (mask $M^{(\ell)}(k)$ derived from ablation scores), achieving up to 4× prefill speedups and maintaining accuracy at up to 10× compression (Li et al., 26 Jul 2025).
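The idea of storing only selected frequency bins can be illustrated with an ordinary DFT along the token axis. The sketch below is not FAEDKV: a plain `numpy.fft.rfft` stands in for the IWDFT recursion, and a fixed low-pass mask stands in for the ablation-derived mask $M^{(\ell)}(k)$. The function names are hypothetical.

```python
import numpy as np

def compress_keys(keys: np.ndarray, keep_mask: np.ndarray) -> np.ndarray:
    """keys: (tokens, dim) real array; keep only the masked frequency bins."""
    spectrum = np.fft.rfft(keys, axis=0)   # DFT along the token axis
    return spectrum[keep_mask]

def reconstruct_keys(kept: np.ndarray, keep_mask: np.ndarray, n_tokens: int) -> np.ndarray:
    """Zero-fill the pruned bins and invert the transform."""
    spectrum = np.zeros((keep_mask.size, kept.shape[1]), dtype=complex)
    spectrum[keep_mask] = kept
    return np.fft.irfft(spectrum, n=n_tokens, axis=0)

n_tokens, dim = 64, 4
t = np.arange(n_tokens)[:, None]
# Toy band-limited "keys": a constant offset plus one slow oscillation.
keys = 0.5 + np.cos(2 * np.pi * 2 * t / n_tokens) * np.ones((1, dim))

keep_mask = np.zeros(n_tokens // 2 + 1, dtype=bool)
keep_mask[:4] = True                        # assumed mask: keep the 4 lowest bins

kept = compress_keys(keys, keep_mask)
recon = reconstruct_keys(kept, keep_mask, n_tokens)
print(kept.shape, np.abs(recon - keys).max())
```

The toy keys are exactly band-limited, so 4 of 33 bins reconstruct them to floating-point precision; real caches are not, which is why the choice of retained bins per layer matters.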

d) Wave Physics Inverse Problems

For full-waveform inversion, FANC combines frequency-adaptive grid selection with cascaded lossy compression (temporal/spatial downsampling, thresholding in spatial, wavelet, or wave-atom domains). Discretization is frequency-band scheduled, with each band modeled at spatial/temporal resolutions just sufficient for its spectral content, and error-controlled compression ensuring the optimization remains stable (Protopapa et al., 2021).
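Frequency-band scheduling of the discretization can be sketched with two standard finite-difference heuristics: a fixed number of grid points per shortest wavelength, and a CFL-type bound on the time step. Both rules, their default constants, and the name `grid_for_band` are assumptions introduced for illustration; the criteria actually used in the cited work may differ.

```python
def grid_for_band(f_max_hz: float, v_min: float, v_max: float,
                  points_per_wavelength: float = 10.0, cfl: float = 0.5):
    """Return (dx, dt) just fine enough for content up to f_max_hz.

    dx resolves the shortest wavelength v_min / f_max_hz with the requested
    number of points; dt follows from a CFL-style bound using v_max.
    """
    dx = v_min / (f_max_hz * points_per_wavelength)
    dt = cfl * dx / v_max
    return dx, dt

# Assumed velocity range in m/s and a low-to-high frequency schedule in Hz.
v_min, v_max = 1500.0, 4500.0
schedule = [5.0, 10.0, 20.0]
grids = [grid_for_band(f, v_min, v_max) for f in schedule]
for f, (dx, dt) in zip(schedule, grids):
    print(f"{f:5.1f} Hz: dx = {dx:6.2f} m, dt = {dt * 1e3:.3f} ms")
```

Under these rules, halving the maximum frequency doubles both the grid spacing and the time step, so low-frequency bands are far cheaper to model than the final high-frequency band.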

3. Rate-Distortion, Training, and Optimization

The FANC training regime typically minimizes a rate-distortion Lagrangian:

$$\mathcal{L} = R + \lambda D$$

with $R$ the estimated code length (cross-entropy or entropy bottlenecks), $D$ a distortion function (MSE, MS-SSIM), and $\lambda$ the rate-distortion tradeoff parameter. For coarse-to-fine or multi-band FANC, separate mask-weighted reconstruction and rate terms are assigned per frequency or region (Rhee et al., 2021, Choi et al., 2023).
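A small numeric sketch of the objective, with an optional mask for the per-band variant, is given below. The rate is taken as the mean negative log2-probability of the coded symbols and the distortion as MSE. The function name `rd_loss`, the toy values, and the normalization by the number of masked elements are assumptions.

```python
import numpy as np

def rd_loss(x, x_hat, probs, lam, mask=None):
    """Return (loss, rate, distortion) for L = R + lam * D over masked elements.

    rate: mean code length in bits, -log2(p), over the masked elements
    distortion: mean squared error over the masked elements
    """
    x, x_hat, probs = map(np.asarray, (x, x_hat, probs))
    mask = np.ones_like(x, dtype=float) if mask is None else np.asarray(mask, dtype=float)
    count = mask.sum()
    rate = -(mask * np.log2(probs)).sum() / count
    dist = (mask * (x - x_hat) ** 2).sum() / count
    return rate + lam * dist, rate, dist

x = np.array([0.0, 1.0, 2.0, 3.0])
x_hat = np.array([0.0, 1.0, 2.0, 2.0])
probs = np.array([0.5, 0.5, 0.25, 0.25])   # model probability of each coded symbol
m_low = np.array([1, 1, 0, 0])             # toy low-frequency mask
m_high = 1 - m_low

full = rd_loss(x, x_hat, probs, lam=4.0)
low = rd_loss(x, x_hat, probs, lam=4.0, mask=m_low)
high = rd_loss(x, x_hat, probs, lam=4.0, mask=m_high)
print(full, low, high)
```

In this toy case the high-frequency band carries both the higher rate and all of the distortion, which is the kind of imbalance a per-band objective makes visible.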

For LDGM-based lossy FANC, a belief-propagation plus decimation encoder achieves near-theoretic efficiency on non-uniform Bernoulli sources by carefully matching quantizer parameters and graph degrees to the target bias and distortion (Cappellari, 2010).
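The theoretical limit that such an encoder is measured against is the rate-distortion function of a Bernoulli($p$) source under Hamming distortion, $R(D) = H_b(p) - H_b(D)$ for $0 \le D \le \min(p, 1-p)$ and $0$ otherwise. The sketch below evaluates this standard formula; it does not implement the LDGM encoder, and the function names are hypothetical.

```python
import math

def binary_entropy(p: float) -> float:
    """Binary entropy H_b(p) in bits, with H_b(0) = H_b(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def bernoulli_rate_distortion(p: float, d: float) -> float:
    """R(D) in bits per symbol for a Bernoulli(p) source, Hamming distortion."""
    if d >= min(p, 1.0 - p):
        return 0.0   # always guessing the likelier symbol already meets D
    return binary_entropy(p) - binary_entropy(d)

r_uniform = bernoulli_rate_distortion(0.5, 0.11)   # unbiased source
r_biased = bernoulli_rate_distortion(0.2, 0.11)    # biased (non-uniform) source
print(round(r_uniform, 3), round(r_biased, 3))
```

At equal distortion the biased source needs fewer bits per symbol than the unbiased one, which is the gain a code matched to the source bias is trying to capture.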

Optimization for complexity is domain-driven: convolutional FANC reduces parameters/MACs in speech models (Xue et al., 31 Jan 2026); FAEDKV achieves $O(N\log N)$ prefill and $O(r\,N\log N)$ per-step reconstruction (Li et al., 26 Jul 2025); FWI compression yields 30% runtime reduction and $10^3$–$10^4\times$ memory savings with $<0.2\%$ error penalty (Protopapa et al., 2021).

4. Empirical Performance and Ablation Evidence

Extensive experiments across domains document the efficiency of FANC:

  • Lossless image compression: State-of-the-art on DIV2K, CLIC.p/m, with FANC outperforming prior learned and hand-engineered codecs by 1–7% in bpp. Dense ablations show losses of up to 11% in high-frequency bpp when the coarse-to-fine scheme is ablated (Rhee et al., 2021).
  • Learned lossy image compression with scalability: Overhead for quality-scalable FANC is just 1–2%, much lower than classical approaches (15–25%), and supports region-of-interest (ROI) enhancement at minimal incremental cost (Choi et al., 2023).
  • KV cache compression in LLMs: FAEDKV surpasses token eviction and learned vector projection approaches by up to 22% on LongBench, and maintains flat retrieval accuracy across position on Needle-In-A-Haystack, unlike convolutional approaches (Li et al., 26 Jul 2025).
  • Speech enhancement: FANC enables extreme architectural efficiency (1.9M params, 10.2G MACs) with PESQ improvements over reference models at under half the parameter count (Xue et al., 31 Jan 2026). Ablations confirm 0.06–0.07 absolute metric drops when FANC is disabled.
  • Full-waveform inversion: Consistent ~30% runtime reduction and $10^3$–$10^4\times$ memory savings; compressed gradients yield $\langle \theta \rangle < 40^\circ$ and admit stable optimization (Protopapa et al., 2021).

5. Practical Deployment and Extensibility

FANC’s modularity allows instantiations in:

Potential extensions include joint spatial-temporal FANC for video, learned adapters for online frequency mask tuning, direct application to attention map compression, or integration with quantization (Li et al., 26 Jul 2025, Rhee et al., 2021).

Limitations include the need to transmit additional mask/threshold data (e.g., $\tau$ per subimage), and in some cases issues with differentiable rate control due to hard masking (Rhee et al., 2021).

6. Comparative Overview and Representative Methods

| Domain | Key FANC Mechanism | Experimental Gains |
| --- | --- | --- |
| Image Compression | Error-variance maps, DCT/AFMM | 1–7% bpp savings, 11% high-freq bpp cut, 18–25% BD-rate on SOTA (Rhee et al., 2021, Pan et al., 25 Nov 2025) |
| Speech Enhancement | Tri-band convolutional partition | 40–60% MACs reduction, 0.06–0.07 metric gain (Xue et al., 31 Jan 2026) |
| KV-Cache (LLM) | IWDFT, frequency ablation masks | 4× speedup, 22% accuracy gain over token eviction, flat retrieval position profile (Li et al., 26 Jul 2025) |
| Wave Physics | Frequency-adaptive gridding, wavelet/atom thresholding | 30% runtime cut, 3–4 orders of magnitude memory savings, $<0.2\%$ final error increase (Protopapa et al., 2021) |

FANC approaches thus provide domain-specific, quantitatively validated improvements over uniform coding baselines, with ablations attributing the bulk of performance to adaptive, fine-grained frequency allocation.

7. Future Directions

Ongoing research aims to generalize FANC by:

A plausible implication is that FANC-based frameworks may become the unifying abstraction for efficient resource allocation in high-dimensional, structured, and evolving signal representations across diverse computational domains.
