
Frequency-Adaptive SAR Despeckling

Updated 15 November 2025
  • The paper introduces a frequency-adaptive framework leveraging Haar wavelet decomposition, neural ODE-driven denoising, and tailored high-frequency modules.
  • It employs a divide-and-conquer strategy to separately enhance homogeneous low-frequency regions and heterogeneous high-frequency details.
  • Empirical evaluations show improved PSNR/SSIM scores, robust noise suppression, and computational efficiency across synthetic and operational SAR images.

Frequency-adaptive heterogeneous despeckling is an advanced methodology within synthetic aperture radar (SAR) image restoration, explicitly targeting the mitigation of signal-dependent speckle noise by exploiting frequency-domain separation and tailored processing for distinct spatial statistics. This paradigm recognizes the fundamentally heterogeneous nature of speckle distributions across spatial regions—homogeneous (low-frequency) zones dominated by gradual intensity transitions, and heterogeneous (high-frequency) zones rich in edge, texture, and point-scatterer content. The SAR-FAH framework embodies this approach, yielding enhanced noise suppression and structural fidelity through specialized modules and a divide-and-conquer architecture.

1. Frequency-Domain Decomposition via Haar Wavelets

The initial stage employs a one-level, two-dimensional separable Haar wavelet transform, precisely defined by four convolution-filter kernels:

$\omega^{LL} = \frac{1}{2}\begin{bmatrix}1&1\\1&1\end{bmatrix},\quad \omega^{LH} = \frac{1}{2}\begin{bmatrix}-1&-1\\1&1\end{bmatrix},\quad \omega^{HL} = \frac{1}{2}\begin{bmatrix}-1&1\\-1&1\end{bmatrix},\quad \omega^{HH} = \frac{1}{2}\begin{bmatrix}1&-1\\-1&1\end{bmatrix}$

These filters are convolved with the noisy SAR observation $\mathbf{X}^\mathrm{n}\in\mathbb{R}^{H\times W}$ and down-sampled by a factor of 2 along each axis to produce four frequency sub-bands:

$$\begin{aligned} &\mathbf{X}^\mathrm{nLL}=\omega^{LL}\ast \mathbf{X}^\mathrm{n} &&\text{(low-frequency, homogeneous)} \\ &\mathbf{X}^\mathrm{nLH}=\omega^{LH}\ast \mathbf{X}^\mathrm{n} &&\text{(high-frequency, vertical detail)} \\ &\mathbf{X}^\mathrm{nHL}=\omega^{HL}\ast \mathbf{X}^\mathrm{n} &&\text{(high-frequency, horizontal detail)} \\ &\mathbf{X}^\mathrm{nHH}=\omega^{HH}\ast \mathbf{X}^\mathrm{n} &&\text{(high-frequency, diagonal detail)} \end{aligned}$$

This explicit separation enables targeted despeckling strategies for spatial regions exhibiting distinct speckle behaviors.
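The one-level Haar decomposition above can be sketched in NumPy. This is an illustrative stand-alone implementation (function and variable names are mine, not from the paper); it correlates each non-overlapping 2×2 block with the four kernels, which is equivalent to filtering followed by stride-2 down-sampling:

```python
import numpy as np

def haar_decompose(x):
    """One-level 2D Haar decomposition of a 2D array with even height
    and width, using the four 2x2 analysis kernels defined above."""
    kernels = {
        "LL": 0.5 * np.array([[ 1,  1], [ 1,  1]]),
        "LH": 0.5 * np.array([[-1, -1], [ 1,  1]]),
        "HL": 0.5 * np.array([[-1,  1], [-1,  1]]),
        "HH": 0.5 * np.array([[ 1, -1], [-1,  1]]),
    }
    # blocks[i, a, j, b] == x[2i + a, 2j + b]: non-overlapping 2x2 tiles.
    blocks = x.reshape(x.shape[0] // 2, 2, x.shape[1] // 2, 2)
    # Correlate every tile with each kernel (sum over the 2x2 axes a, b),
    # yielding four half-resolution sub-bands.
    return {name: np.einsum("iajb,ab->ij", blocks, w)
            for name, w in kernels.items()}

img = np.arange(16, dtype=float).reshape(4, 4)
sub = haar_decompose(img)
# Each sub-band has half the spatial size of the input: here (2, 2).
```

For the top-left tile of `img` (values 0, 1, 4, 5), the LL response is $\frac{1}{2}(0+1+4+5)=5$, while the HH response is $\frac{1}{2}(0-1-4+5)=0$, reflecting the tile's lack of diagonal detail.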

2. Low-Frequency Denoising: Neural ODE-Driven LFSP-ODE

The $\mathbf{X}^\mathrm{nLL}$ branch (low-frequency, homogeneous) undergoes feature transformation and continuous dynamic modeling via neural ordinary differential equations (ODEs):

$\mathbf{F}^\mathrm{nLL}=\text{Conv-BN-ReLU}(\mathbf{X}^\mathrm{nLL})$

Denoising and structural preservation are mathematically expressed as:

$\frac{d\mathbf{u}(t)}{dt} = f_\theta^\mathrm{LFSP}(\mathbf{u}(t), t), \quad \mathbf{u}(0) = \mathbf{F}^\mathrm{nLL}$

where $t\in[0,1]$. The ODE is numerically solved using a randomized Euler integrator over $N=4$ steps:

$\mathbf{u}(t_{i+1}) = \mathbf{u}(t_i) + \Delta t\, f_\theta^\mathrm{LFSP}(\mathbf{u}(t_i), t_i)$

The neural field $f_\theta^\mathrm{LFSP}$ comprises seven Conv–BN–ReLU blocks interleaved with two DASS (Dynamic Attentive State-Space) modules, which combine CBAM attention mechanisms and lightweight VMamba state-space processing to fuse local and global features. The final state $\mathbf{u}(T)$ yields the denoised low-frequency feature $\mathbf{F}^\mathrm{LL}$.
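The Euler update above is straightforward to sketch. The following minimal example (names are mine; the paper's solver is torchdiffeq) uses a fixed-step integrator and a toy analytic vector field standing in for the learned network $f_\theta^\mathrm{LFSP}$:

```python
import numpy as np

def euler_solve(f, u0, n_steps=4, t0=0.0, t1=1.0):
    """Fixed-step Euler integration of du/dt = f(u, t) on [t0, t1],
    mirroring the N = 4 step solve of the LFSP-ODE branch."""
    u = np.asarray(u0, dtype=float)
    dt = (t1 - t0) / n_steps
    for i in range(n_steps):
        t = t0 + i * dt
        # u(t_{i+1}) = u(t_i) + dt * f(u(t_i), t_i)
        u = u + dt * f(u, t)
    return u

# Toy vector field standing in for the learned f_theta: exponential decay,
# driving the state (noisy features) smoothly toward zero.
decay = lambda u, t: -u
u_final = euler_solve(decay, np.ones(3), n_steps=4)
# Four Euler steps of du/dt = -u give (1 - 0.25)^4 = 0.31640625 per entry.
```

In SAR-FAH the vector field is a learned CNN over feature maps rather than this closed-form decay, but the integration loop is identical in structure.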

3. High-Frequency Denoising and Enhancement (HFDE Modules)

Each high-frequency wavelet sub-band ($\mathbf{X}^{\mathrm{nLH}}$, $\mathbf{X}^{\mathrm{nHL}}$, $\mathbf{X}^{\mathrm{nHH}}$) is mapped to a feature tensor and processed by its own HFDE module (three in total), each with the following architecture:

  • Encoder: Two stages of Conv–BN–ReLU–MaxPool$_{2\times2}$, reducing spatial size and increasing representation depth.
  • Bottleneck: One Conv–BN–ReLU and one DeformableConv–BN–ReLU, with an embedded DASS block for context fusion.
  • Decoder: Two upsampling stages (bilinear or transposed convolution), each preceded by deformable convolutions, plus dense skip connections linking encoder stages to their corresponding decoder scales.
  • Output: Two concluding Conv–BN–ReLU layers producing the denoised high-frequency features.

Deformable convolution is defined as

$y(p_0) = \sum_{p\in\mathcal{R}} w(p)\,\mathbf{x}(p_0 + p + \Delta p)$

where the offset field $\Delta p$ is dynamically learned, enabling the receptive field to adapt around edges and textures for optimal speckle suppression and feature preservation.
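The formula can be illustrated with a single-channel sketch in which fractional offsets $\Delta p$ are resolved by bilinear sampling. Everything here is illustrative: the offsets are supplied by hand, whereas in SAR-FAH they are predicted by a learned layer:

```python
import numpy as np

def bilinear(x, py, px):
    """Bilinearly sample 2D array x at the fractional position (py, px),
    clamping to the image boundary."""
    y0 = min(max(int(np.floor(py)), 0), x.shape[0] - 1)
    x0 = min(max(int(np.floor(px)), 0), x.shape[1] - 1)
    y1, x1 = min(y0 + 1, x.shape[0] - 1), min(x0 + 1, x.shape[1] - 1)
    wy, wx = py - y0, px - x0
    return ((1 - wy) * (1 - wx) * x[y0, x0] + (1 - wy) * wx * x[y0, x1]
            + wy * (1 - wx) * x[y1, x0] + wy * wx * x[y1, x1])

def deformable_response(x, w, offsets, p0):
    """y(p0) = sum over the 3x3 grid R of w(p) * x(p0 + p + Delta p),
    with fractional offsets Delta p handled by bilinear sampling."""
    grid = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    out = 0.0
    for (dy, dx), (oy, ox) in zip(grid, offsets):
        out += w[dy + 1, dx + 1] * bilinear(x, p0[0] + dy + oy, p0[1] + dx + ox)
    return out

x = np.arange(25, dtype=float).reshape(5, 5)
w = np.full((3, 3), 1.0 / 9.0)       # averaging kernel, for illustration only
zero = [(0.0, 0.0)] * 9              # zero offsets reduce to an ordinary conv
y_plain = deformable_response(x, w, zero, (2, 2))
```

With all offsets zero the response at the center is simply the 3×3 average; non-zero offsets would shift each tap of the kernel toward nearby structure, which is the mechanism the HFDE modules exploit around edges.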

4. Feature Fusion, Reconstruction, and Optimization

Post-denoising, all four features $\{\mathbf{F}^{\mathrm{LL}}, \mathbf{F}^{\mathrm{LH}}, \mathbf{F}^{\mathrm{HL}}, \mathbf{F}^{\mathrm{HH}}\}$ are concatenated (producing $4C$ channels), fused via a $1\times1$ convolution and residual block, and reconstructed via the inverse Haar transform:

$\mathbf{X}^{\mathrm{out}} = \text{InverseHaar}(\mathbf{F})$
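Because the four Haar kernels form an orthonormal basis of each 2×2 block, the inverse transform is exact. The following self-contained NumPy sketch (helper names are mine, not the paper's) pairs the forward projection with its inverse and demonstrates perfect reconstruction:

```python
import numpy as np

# The four 2x2 Haar kernels: orthonormal (unit norm, pairwise orthogonal).
K = {
    "LL": 0.5 * np.array([[ 1,  1], [ 1,  1]]),
    "LH": 0.5 * np.array([[-1, -1], [ 1,  1]]),
    "HL": 0.5 * np.array([[-1,  1], [-1,  1]]),
    "HH": 0.5 * np.array([[ 1, -1], [-1,  1]]),
}

def haar_forward(x):
    """Project each non-overlapping 2x2 block onto the four kernels."""
    blocks = x.reshape(x.shape[0] // 2, 2, x.shape[1] // 2, 2)
    return {name: np.einsum("iajb,ab->ij", blocks, w) for name, w in K.items()}

def haar_inverse(bands):
    """Re-expand each sub-band coefficient against its kernel and sum:
    exact because the kernels are an orthonormal basis of every block."""
    h, w = bands["LL"].shape
    x = np.zeros((2 * h, 2 * w))
    for name, k in K.items():
        x += np.einsum("ij,ab->iajb", bands[name], k).reshape(2 * h, 2 * w)
    return x

img = np.random.default_rng(0).standard_normal((4, 4))
assert np.allclose(haar_inverse(haar_forward(img)), img)  # perfect reconstruction
```

In the actual network the inverse is applied to the fused, denoised feature tensor $\mathbf{F}$ rather than to untouched coefficients, so the output is a restored image rather than a copy of the input.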

Training is supervised, optimizing the $L_1$ loss:

$\mathcal{L}(\Theta) = \frac{1}{n} \sum_{i=1}^n \left\| f_{\text{SAR-FAH}}(\mathbf{X}^\mathrm{n}_i; \Theta) - \mathbf{X}_i \right\|_1$

No additional frequency-specific weighting or explicit statistical regularization is incorporated, although the DASS modules exert a mild regularizing influence.
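The objective reduces to a batch-averaged per-sample $L_1$ norm; a minimal NumPy sketch with toy data (names and values are illustrative only):

```python
import numpy as np

def l1_loss(preds, targets):
    """Mean over samples of the elementwise L1 norm ||f(x_i) - x_i||_1,
    matching the supervised objective above."""
    return np.mean([np.sum(np.abs(p - t)) for p, t in zip(preds, targets)])

# Two toy "images": per-sample losses are 0.5 and 2.0, so the mean is 1.25.
preds = [np.array([[1.0, 2.0]]), np.array([[0.0, 0.0]])]
targets = [np.array([[1.5, 2.0]]), np.array([[1.0, -1.0]])]
loss = l1_loss(preds, targets)
```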

5. Training Protocol and Implementation Details

SAR-FAH is trained on synthetic SAR data using the UC Merced Land-Use (UCL) dataset, and on texture datasets DTD (47 classes) and FMD (10 classes). Key parameters are:

| Parameter | Setting | Comments |
| --- | --- | --- |
| Image preprocessing | Center crop/resize to $256 \times 256$ | Standard across datasets |
| Patching | Cut to $128 \times 128$ | Augmented to $\sim 29\,400$ patches |
| Optimizer | Adam, lr $= 1\times10^{-3}$, cosine annealing to $1\times10^{-6}$ | |
| Batch size | 16 | |
| Epochs | 20 | |
| Feature channels $C$ | 128 | |
| NODE solver | torchdiffeq, $T=1$, $N=4$ Euler steps | Implementation detail |
| Hardware | PyTorch 2.2.2, NVIDIA RTX 3090 | |
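The cosine-annealed learning-rate schedule from the table can be written explicitly. This small sketch assumes per-epoch annealing (the exact per-step vs. per-epoch granularity is my assumption, not stated in the source):

```python
import math

def cosine_annealed_lr(epoch, total_epochs, lr_max=1e-3, lr_min=1e-6):
    """Cosine annealing from lr_max down to lr_min over the training run,
    matching the 1e-3 -> 1e-6 schedule in the table above."""
    t = epoch / total_epochs               # progress in [0, 1]
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t))

# Over the 20-epoch run: starts at 1e-3, decays monotonically to 1e-6.
lrs = [cosine_annealed_lr(e, 20) for e in range(21)]
```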

6. Empirical Evaluation and Ablations

Quantitative metrics on synthetic SAR (UCL), texture datasets, and real SAR imagery underscore SAR-FAH's performance:

  • Synthetic SAR (UCL, $L=1$): PSNR = 23.61 dB, SSIM = 0.6260 (vs. 23.37 dB / 0.6076 for SAR-Trans).
  • Synthetic SAR ($L=10$): PSNR = 28.41 dB (vs. 28.10 dB for SAR-CAM).
  • Texture benchmarks: highest recorded PSNR/SSIM.
  • No-reference metrics (GaoFen-1 L1): ENL = 1213.8, MoR $\approx 0.99$; SAR-FAH ranks first or second on all metrics.

Ablation studies demonstrate:

| Component | PSNR/SSIM Change | Interpretation |
| --- | --- | --- |
| Remove DASS | Down to 23.40 / 0.6046 | DASS is crucial for local/global feature fusion |
| Remove NODE | Down to 23.18 / 0.5966 | NODE yields superior low-frequency despeckling |
| Deformable conv (decoder) | Down to 23.52 | Asymmetric placement is most effective |
| Shared HFDE | Down to 23.44 | Unshared HFDE enables band-specialized denoising |
| HFDE for low-frequency branch | PSNR 24.41, artifacts | LFSP-ODE is essential for artifact-free low-frequency restoration |

7. Computational Efficiency and Scalability

With $N=4$ ODE steps, SAR-FAH comprises 7.32 million parameters and achieves $\sim 3.1$ seconds inference per $256 \times 256$ image. Adjusting $N$ enables trade-offs between accuracy and inference speed:

  • $N=1$: PSNR = 23.42 dB, $\sim 1.4$ seconds
  • SAR-FAH consistently matches or outperforms competitors (e.g., SAR-Trans, HTNet) with fewer parameters and competitive GPU throughput

A plausible implication is that frequency-adaptive architectures can achieve high despeckling quality with faster inference and lower model complexity than monolithic networks, particularly when leveraging specialized submodules tuned for heterogeneous spatial statistics.


Frequency-adaptive heterogeneous despeckling, as instantiated in SAR-FAH, establishes a principled, empirically validated framework for SAR restoration, leveraging wavelet-domain decomposition and module specialization to circumvent the limitations of single-branch deep learning approaches. This technique achieves state-of-the-art speckle suppression and structural preservation across both synthetic and operational SAR imagery by dynamically aligning model capacity and receptive fields with the statistical heterogeneity of natural scenes (Ma et al., 8 Nov 2025).
