Frequency-Adaptive SAR Despeckling

Updated 15 November 2025

The paper introduces a frequency-adaptive framework leveraging Haar wavelet decomposition, neural ODE-driven denoising, and tailored high-frequency modules.
It employs a divide-and-conquer strategy to separately enhance homogeneous low-frequency regions and heterogeneous high-frequency details.
Empirical evaluations show improved PSNR/SSIM scores, robust noise suppression, and computational efficiency across synthetic and operational SAR images.

Frequency-adaptive heterogeneous despeckling is a synthetic aperture radar (SAR) image restoration methodology that mitigates signal-dependent speckle noise by separating the image in the frequency domain and processing each part according to its spatial statistics. The paradigm recognizes that speckle behaves differently across spatial regions: homogeneous (low-frequency) zones are dominated by gradual intensity transitions, whereas heterogeneous (high-frequency) zones are rich in edges, texture, and point scatterers. The SAR-FAH framework embodies this approach, improving noise suppression and structural fidelity through specialized modules and a divide-and-conquer architecture.

1. Frequency-Domain Decomposition via Haar Wavelets

The initial stage employs a one-level, two-dimensional separable Haar wavelet transform, precisely defined by four convolution-filter kernels:

$\omega^{LL} =\frac{1}{2}\begin{bmatrix}1&1\\1&1\end{bmatrix},\;\; \omega^{LH} =\frac{1}{2}\begin{bmatrix}-1&-1\\1&1\end{bmatrix},\;\; \omega^{HL} =\frac{1}{2}\begin{bmatrix}-1&1\\-1&1\end{bmatrix},\;\; \omega^{HH} =\frac{1}{2}\begin{bmatrix}1&-1\\-1&1\end{bmatrix}$

These filters are convolved with the noisy SAR observation $\mathbf{X}^\mathrm{n}\in\mathbb{R}^{H\times W}$ and down-sampled by a factor of 2 along each axis, producing four frequency sub-bands of size $\frac{H}{2}\times\frac{W}{2}$:

$\begin{aligned} &\mathbf{X}^\mathrm{nLL}=\omega^{LL}\ast \mathbf{X}^\mathrm{n} &&\text{(Low-frequency, homogeneous)} \\ &\mathbf{X}^\mathrm{nLH}=\omega^{LH}\ast \mathbf{X}^\mathrm{n} &&\text{(High-frequency, vertical detail)} \\ &\mathbf{X}^\mathrm{nHL}=\omega^{HL}\ast \mathbf{X}^\mathrm{n} &&\text{(High-frequency, horizontal detail)} \\ &\mathbf{X}^\mathrm{nHH}=\omega^{HH}\ast \mathbf{X}^\mathrm{n} &&\text{(High-frequency, diagonal detail)} \end{aligned}$

This explicit separation enables targeted despeckling strategies for spatial regions exhibiting distinct speckle behaviors.
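The decomposition above can be illustrated with a minimal, dependency-free sketch. It assumes the kernels are applied as stride-2 sliding windows in the CNN (cross-correlation) sense, with no kernel flipping, and that the image has even height and width. The function name `haar_dwt2` and the toy image are illustrative only and do not come from the paper.

```python
# The four Haar kernels from the text, each scaled by 1/2.
KERNELS = {
    "LL": [[0.5, 0.5], [0.5, 0.5]],
    "LH": [[-0.5, -0.5], [0.5, 0.5]],
    "HL": [[-0.5, 0.5], [-0.5, 0.5]],
    "HH": [[0.5, -0.5], [-0.5, 0.5]],
}


def haar_dwt2(image):
    """One-level 2D Haar analysis on a list-of-lists image.

    Each kernel is applied to non-overlapping 2x2 blocks (stride 2), so every
    sub-band has half the height and half the width of the input.
    """
    h, w = len(image), len(image[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    bands = {}
    for name, k in KERNELS.items():
        bands[name] = [
            [
                sum(k[a][b] * image[i + a][j + b]
                    for a in range(2) for b in range(2))
                for j in range(0, w, 2)
            ]
            for i in range(0, h, 2)
        ]
    return bands


# Toy 4x4 image: one flat block, one block that varies along columns,
# one block that varies along rows, and one empty block.
image = [
    [1.0, 1.0, 2.0, 4.0],
    [1.0, 1.0, 2.0, 4.0],
    [3.0, 3.0, 0.0, 0.0],
    [5.0, 5.0, 0.0, 0.0],
]
bands = haar_dwt2(image)
```

In this toy case the flat block contributes only to the LL band, while the blocks with row-wise or column-wise variation also produce non-zero LH or HL coefficients. That separation is what the later branches rely on.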

2. Low-Frequency Denoising: Neural ODE-Driven LFSP-ODE

The $\mathbf{X}^\mathrm{nLL}$ branch (low-frequency, homogeneous) undergoes feature transformation and continuous dynamic modeling via neural ordinary differential equations (ODEs):

$\mathbf{F}^\mathrm{nLL}=\text{Conv-BN-ReLU}(\mathbf{X}^\mathrm{nLL})$

Denoising and structural preservation are mathematically expressed as:

$\frac{d\mathbf{u}(t)}{dt} = f_\theta^\mathrm{LFSP}(\mathbf{u}(t), t),\;\; \mathbf{u}(0) = \mathbf{F}^\mathrm{nLL}$

where $t\in[0,1]$ . The ODE is numerically solved using a randomized Euler integrator over $N=4$ steps:

$\mathbf{u}(t_{i+1}) = \mathbf{u}(t_i) + \Delta t\,f_\theta^\mathrm{LFSP}(\mathbf{u}(t_i), t_i)$

The neural field $f_\theta^\mathrm{LFSP}$ comprises seven Conv–BN–ReLU blocks interleaved with two DASS (Dynamic Attentive State-Space) modules, which combine CBAM attention with lightweight VMamba state-space processing to fuse local and global features. The final state $\mathbf{u}(T)$, with $T=1$, yields the denoised low-frequency feature $\mathbf{F}^\mathrm{LL}$.
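The Euler update above can be sketched in a few lines. This sketch assumes uniform steps $\Delta t = T/N$ and does not model any randomization of the integrator. A simple decay field, `toy_field`, stands in for the learned $f_\theta^\mathrm{LFSP}$. Both function names are hypothetical.

```python
def euler_integrate(field, u0, t0=0.0, t1=1.0, n_steps=4):
    """Forward Euler: u(t_{i+1}) = u(t_i) + dt * field(u(t_i), t_i)."""
    dt = (t1 - t0) / n_steps
    u, t = list(u0), t0
    for _ in range(n_steps):
        du = field(u, t)
        u = [ui + dt * dui for ui, dui in zip(u, du)]
        t += dt
    return u


def toy_field(u, t):
    """Stand-in for the learned vector field: plain exponential decay."""
    return [-ui for ui in u]


# With N = 4 steps on [0, 1], each step multiplies the state by (1 - 0.25).
u_final = euler_integrate(toy_field, [1.0, 2.0], n_steps=4)
```

Reducing `n_steps` means fewer evaluations of the vector field per forward pass, which is the accuracy/speed trade-off governed by $N$.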

3. High-Frequency Denoising and Enhancement (HFDE Modules)

Each high-frequency wavelet sub-band ($\mathbf{X}^{\mathrm{nLH}}, \mathbf{X}^{\mathrm{nHL}}, \mathbf{X}^{\mathrm{nHH}}$) is mapped to a feature tensor and processed by its own HFDE module. The three modules share the following architecture but not their weights:

Encoder: Two stages of Conv–BN–ReLU–MaxPool$_{2\times2}$, reducing spatial size and increasing representation depth.
Bottleneck: One Conv–BN–ReLU and one DeformableConv–BN–ReLU, with an embedded DASS block for context fusion.
Decoder: Two upsampling (bilinear or transposed) stages, each with preceding DeformableConvs, plus dense skip connections linking encoder stages to their respective decoder scales.
Output: Two concluding Conv–BN–ReLU layers producing the denoised high-frequency features.
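To make the encoder/decoder geometry concrete, the following sketch traces only the spatial side length through the stages listed above. It assumes that all convolutions preserve spatial size, that each upsampling stage doubles it, and that the input is a $128\times128$ training patch. Channel widths are not traced, and the function name is illustrative.

```python
def hfde_spatial_sizes(patch_size=128):
    """Trace the spatial side length through one HFDE branch."""
    sizes = {"input patch": patch_size}
    side = patch_size // 2            # one-level Haar halves each axis
    sizes["Haar sub-band"] = side
    for stage in (1, 2):              # encoder: each 2x2 max-pool halves the side
        side //= 2
        sizes[f"encoder stage {stage}"] = side
    sizes["bottleneck"] = side        # assumed size-preserving convolutions
    for stage in (1, 2):              # decoder: each upsampling doubles the side
        side *= 2
        sizes[f"decoder stage {stage}"] = side
    sizes["output"] = side
    return sizes


sizes = hfde_spatial_sizes(128)
for name, side in sizes.items():
    print(f"{name}: {side} x {side}")
```

Under these assumptions the output returns to the sub-band resolution, so the skip connections can pair each encoder stage with the decoder stage of matching size.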

Deformable convolution is defined as

$y(p_0) = \sum_{p\in\mathcal{R}} w(p)\,\mathbf{x}(p_0 + p + \Delta p)$

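where $\mathcal{R}$ is the regular sampling grid of the kernel, $w(p)$ is the kernel weight at grid position $p$, and the offset field $\Delta p$ is learned from the input. The learned offsets let the receptive field adapt around edges and textures, suppressing speckle while preserving features.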
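The sampling rule can be sketched for a single channel and a single output location. Because $p_0 + p + \Delta p$ is generally fractional, the sketch assumes bilinear interpolation with zero padding outside the image, which is the usual choice for deformable convolution. In a real layer the offsets come from a learned convolution; here they are passed in by hand, and all names are illustrative.

```python
import math


def bilinear_sample(x, row, col):
    """Bilinearly interpolate grid x at a fractional (row, col); zero outside."""
    h, w = len(x), len(x[0])
    r0, c0 = math.floor(row), math.floor(col)
    fr, fc = row - r0, col - c0
    value = 0.0
    for dr, wr in ((0, 1.0 - fr), (1, fr)):
        for dc, wc in ((0, 1.0 - fc), (1, fc)):
            r, c = r0 + dr, c0 + dc
            if 0 <= r < h and 0 <= c < w:
                value += wr * wc * x[r][c]
    return value


def deformable_conv_at(x, weights, offsets, p0):
    """y(p0) = sum over a 3x3 grid R of w(p) * x(p0 + p + delta_p)."""
    grid = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
    y = 0.0
    for (dr, dc), w, (odr, odc) in zip(grid, weights, offsets):
        y += w * bilinear_sample(x, p0[0] + dr + odr, p0[1] + dc + odc)
    return y


# 5x5 intensity ramp: x[r][c] = 10 * r + c.
x = [[10.0 * r + c for c in range(5)] for r in range(5)]
weights = [1.0 / 9.0] * 9                    # simple averaging kernel
no_offsets = [(0.0, 0.0)] * 9                # reduces to an ordinary 3x3 conv
shifted = [(0.5, 0.0)] * 9                   # every tap moved half a pixel down

y_regular = deformable_conv_at(x, weights, no_offsets, (2, 2))
y_shifted = deformable_conv_at(x, weights, shifted, (2, 2))
```

With zero offsets the result matches a standard $3\times3$ average. Shifting every tap by half a row moves the sampled window along the ramp, raising the output by half the row-to-row increment.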

4. Feature Fusion, Reconstruction, and Optimization

After denoising, the four features $\{\mathbf{F}^{\mathrm{LL}}, \mathbf{F}^{\mathrm{LH}}, \mathbf{F}^{\mathrm{HL}}, \mathbf{F}^{\mathrm{HH}}\}$ are concatenated (producing $4C$ channels), fused into $\mathbf{F}$ by a $1\times1$ convolution and a residual block, and reconstructed via the inverse Haar transform:

$\mathbf{X}^{\mathrm{out}} = \text{InverseHaar}(\mathbf{F})$
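The inverse transform can be sketched under one simplifying assumption: that the fused features have already been reduced to four single-channel sub-bands (LL, LH, HL, HH). Because the four Haar kernels form an orthonormal basis for each $2\times2$ block, synthesis is the weighted sum of the kernels. The function name and the sample coefficients are illustrative.

```python
# Same four kernels as in the analysis step, each scaled by 1/2.
KERNELS = {
    "LL": [[0.5, 0.5], [0.5, 0.5]],
    "LH": [[-0.5, -0.5], [0.5, 0.5]],
    "HL": [[-0.5, 0.5], [-0.5, 0.5]],
    "HH": [[0.5, -0.5], [-0.5, 0.5]],
}


def inverse_haar2(bands):
    """Rebuild an image from four half-resolution Haar sub-bands.

    Each output 2x2 block is the sum of the four kernels, each weighted by
    the matching sub-band coefficient.
    """
    rows, cols = len(bands["LL"]), len(bands["LL"][0])
    image = [[0.0] * (2 * cols) for _ in range(2 * rows)]
    for name, k in KERNELS.items():
        band = bands[name]
        for i in range(rows):
            for j in range(cols):
                for a in range(2):
                    for b in range(2):
                        image[2 * i + a][2 * j + b] += k[a][b] * band[i][j]
    return image


# Hand-picked coefficients for a 4x4 image.
bands = {
    "LL": [[2.0, 6.0], [8.0, 0.0]],
    "LH": [[0.0, 0.0], [2.0, 0.0]],
    "HL": [[0.0, 2.0], [0.0, 0.0]],
    "HH": [[0.0, 0.0], [0.0, 0.0]],
}
reconstructed = inverse_haar2(bands)
```

The output has twice the side length of each sub-band, which restores the resolution of the original observation.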

Training is supervised, optimizing the $L_1$ loss:

$\mathcal{L}(\Theta) = \frac{1}{n} \sum_{i=1}^n \|\; f_{\text{SAR-FAH}}(\mathbf{X}^\mathrm{n}_i; \Theta) - \mathbf{X}_i \; \|_1$

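Here $\mathbf{X}_i$ is the clean reference for the $i$-th noisy input $\mathbf{X}^\mathrm{n}_i$. No additional frequency-specific weighting or explicit statistical regularization is used, although the DASS modules may exert a mild regularizing influence.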
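As a small worked instance of the objective, the sketch below evaluates the loss exactly as written: the per-image $\ell_1$ norm (sum of absolute pixel differences) averaged over the $n$ images in a batch. Deep-learning frameworks often also average over pixels, which only rescales the loss. Whether the authors do so is not stated here, and the function name is illustrative.

```python
def l1_loss(predictions, targets):
    """Mean over images of the summed absolute pixel difference."""
    total = 0.0
    for pred, tgt in zip(predictions, targets):
        total += sum(
            abs(p - t)
            for pred_row, tgt_row in zip(pred, tgt)
            for p, t in zip(pred_row, tgt_row)
        )
    return total / len(predictions)


# Two 2x2 "images": the first differs by 0.5 at every pixel, the second is exact.
predictions = [[[1.5, 2.5], [3.5, 4.5]], [[0.0, 1.0], [2.0, 3.0]]]
targets = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 1.0], [2.0, 3.0]]]
loss = l1_loss(predictions, targets)
```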

5. Training Protocol and Implementation Details

SAR-FAH is trained on synthetic SAR data using the UC Merced Land-Use (UCL) dataset, and on texture datasets DTD (47 classes) and FMD (10 classes). Key parameters are:

Parameter	Setting	Comments
Image preprocessing	Center crop/resize to $256 \times 256$	Standard across datasets
Patching	Cut to $128 \times 128$	Augmented to $\sim29\,400$ patches
Optimizer	Adam, lr $=1\times10^{-3}$, cosine annealing	lr $\to1\times10^{-6}$
Batch size	16
Epochs	20
Feature channels $C$	128
NODE solver	torchdiffeq, $T=1$, $N=4$ Euler steps	Implementation detail
Software / hardware	PyTorch 2.2.2, NVIDIA RTX 3090
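The learning-rate schedule in the table can be sketched with the standard closed-form cosine annealing rule, $\eta_t = \eta_{\min} + \tfrac{1}{2}(\eta_{\max}-\eta_{\min})\bigl(1+\cos(\pi t/T_{\max})\bigr)$. The sketch assumes one update per epoch and no warm restarts. Whether the authors step per epoch or per iteration is not stated, and the function name is illustrative.

```python
import math


def cosine_annealed_lr(epoch, total_epochs=20, lr_max=1e-3, lr_min=1e-6):
    """Closed-form cosine annealing from lr_max (epoch 0) to lr_min (last epoch)."""
    cos_term = 1.0 + math.cos(math.pi * epoch / total_epochs)
    return lr_min + 0.5 * (lr_max - lr_min) * cos_term


# Learning rate at the start of each epoch, plus the final value.
schedule = [cosine_annealed_lr(e) for e in range(21)]
for epoch in (0, 5, 10, 15, 20):
    print(f"epoch {epoch:2d}: lr = {schedule[epoch]:.3e}")
```

The rate decays slowly at first, fastest around the midpoint, and flattens again as it approaches the floor of $1\times10^{-6}$.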

6. Empirical Evaluation and Ablations

Quantitative metrics on synthetic SAR (UCL), texture datasets, and real SAR imagery underscore SAR-FAH's performance:

Synthetic SAR (UCL, $L=1$): PSNR = 23.61 dB, SSIM = 0.6260 (vs. 23.37 dB, 0.6076 for SAR-Trans).
Synthetic SAR ($L=10$): PSNR = 28.41 dB (vs. 28.10 dB for SAR-CAM).
Texture benchmarks: Highest PSNR/SSIM among the compared methods.
No-reference metrics (GaoFen-1 L1): ENL = 1213.8, MoR $\approx 0.99$; SAR-FAH ranks first or second on all metrics.
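For readers unfamiliar with the look number $L$ and the no-reference metrics, the sketch below simulates speckle and evaluates ENL and MoR on a flat region. It assumes the standard fully developed speckle model for intensity images (multiplicative, unit-mean Gamma noise with shape $L$), ENL defined as mean squared over variance, and MoR as the mean of the noisy-to-filtered ratio. The source does not spell out these definitions, so treat them as assumptions. All function names are illustrative.

```python
import random
import statistics


def add_speckle(intensity, looks, rng):
    """Multiply each pixel by unit-mean Gamma noise with shape `looks`."""
    return [
        [pixel * rng.gammavariate(looks, 1.0 / looks) for pixel in row]
        for row in intensity
    ]


def enl(region):
    """Equivalent number of looks over a homogeneous region: mean^2 / variance."""
    values = [v for row in region for v in row]
    return statistics.fmean(values) ** 2 / statistics.pvariance(values)


def mean_of_ratio(noisy, filtered):
    """Mean of the pixel-wise ratio noisy / filtered; close to 1 is ideal."""
    ratios = [
        n / f
        for noisy_row, filt_row in zip(noisy, filtered)
        for n, f in zip(noisy_row, filt_row)
    ]
    return statistics.fmean(ratios)


rng = random.Random(0)                       # fixed seed for repeatability
clean = [[100.0] * 128 for _ in range(128)]  # perfectly homogeneous scene
noisy = add_speckle(clean, looks=4, rng=rng)

enl_noisy = enl(noisy)                       # close to the look number
mor_ideal = mean_of_ratio(noisy, clean)      # close to 1 for a perfect filter
```

On a flat region the ENL of the raw speckled image is approximately $L$. A despeckling method raises it by reducing the variance, while a MoR near 1 indicates that the mean radiometry is preserved.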

Ablation studies demonstrate:

Component	PSNR/SSIM Change	Interpretation
Remove DASS	Down to 23.40/0.6046	DASS is crucial for local/global feature fusion
Remove NODE	Down to 23.18/0.5966	NODE yields superior performance for low-frequency despeckling
Deformable Conv (dec)	Down to 23.52	Asymmetric placement is most effective
Shared HFDE	Down to 23.44	Unshared HFDE enables band-specialized denoising
HFDE for LF	PSNR 24.41, artifacts	LFSP-ODE is essential for artifact-free low-frequency restoration

7. Computational Efficiency and Scalability

With $N=4$ ODE steps, SAR-FAH comprises 7.32 million parameters and takes $\sim3.1$ seconds per $256 \times 256$ image at inference. Adjusting $N$ trades accuracy against inference speed: with $N=1$, PSNR is 23.42 dB at $\sim1.4$ seconds per image.

SAR-FAH consistently matches or outperforms competitors (e.g., SAR-Trans, HTNet) with fewer parameters and competitive GPU throughput.

A plausible implication is that frequency-adaptive architectures can achieve high despeckling quality with faster inference and lower model complexity than monolithic networks, particularly when leveraging specialized submodules tuned for heterogeneous spatial statistics.

Frequency-adaptive heterogeneous despeckling, as instantiated in SAR-FAH, establishes a principled, empirically validated framework for SAR restoration, leveraging wavelet-domain decomposition and module specialization to circumvent the limitations of single-branch deep learning approaches. This technique achieves state-of-the-art speckle suppression and structural preservation across both synthetic and operational SAR imagery by dynamically aligning model capacity and receptive fields with the statistical heterogeneity of natural scenes (Ma et al., 8 Nov 2025).


References (1)

Towards Frequency-Adaptive Learning for SAR Despeckling (2025)
