Frequency-Adaptive SAR Despeckling
- The paper introduces a frequency-adaptive framework leveraging Haar wavelet decomposition, neural ODE-driven denoising, and tailored high-frequency modules.
- It employs a divide-and-conquer strategy to separately enhance homogeneous low-frequency regions and heterogeneous high-frequency details.
- Empirical evaluations show improved PSNR/SSIM scores, robust noise suppression, and computational efficiency across synthetic and operational SAR images.
Frequency-adaptive heterogeneous despeckling is a methodology for synthetic aperture radar (SAR) image restoration that mitigates signal-dependent speckle noise by separating the image in the frequency domain and processing each part according to its spatial statistics. The approach recognizes that speckle behaves differently across spatial regions: homogeneous (low-frequency) zones are dominated by gradual intensity transitions, while heterogeneous (high-frequency) zones are rich in edge, texture, and point-scatterer content. The SAR-FAH framework embodies this approach, improving noise suppression and structural fidelity through specialized modules and a divide-and-conquer architecture.
1. Frequency-Domain Decomposition via Haar Wavelets
The initial stage employs a one-level, two-dimensional separable Haar wavelet transform, precisely defined by four convolution-filter kernels:
$\omega^{LL} =\frac{1}{2}\begin{bmatrix}1&1\\1&1\end{bmatrix},\;\; \omega^{LH} =\frac{1}{2}\begin{bmatrix}-1&-1\\1&1\end{bmatrix},\;\; \omega^{HL} =\frac{1}{2}\begin{bmatrix}-1&1\\-1&1\end{bmatrix},\;\; \omega^{HH} =\frac{1}{2}\begin{bmatrix}1&-1\\-1&1\end{bmatrix}$
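These filters are convolved with the noisy SAR observation and down-sampled by a factor of 2 along each axis to produce four frequency sub-bands: one low-frequency approximation (LL) and three high-frequency detail bands (LH, HL, HH).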
This explicit separation enables targeted despeckling strategies for spatial regions exhibiting distinct speckle behaviors.
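The decomposition above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the paper's code: the function names `haar_dwt2` and `haar_idwt2` are hypothetical, and the kernels are assumed to be applied as element-wise products over non-overlapping 2×2 blocks (the source does not state the flip convention of the convolution).

```python
import numpy as np

# The four 2x2 Haar kernels given in the text.
KERNELS = {
    "LL": 0.5 * np.array([[1, 1], [1, 1]]),
    "LH": 0.5 * np.array([[-1, -1], [1, 1]]),
    "HL": 0.5 * np.array([[-1, 1], [-1, 1]]),
    "HH": 0.5 * np.array([[1, -1], [-1, 1]]),
}


def haar_dwt2(x):
    """One-level 2D Haar transform of an (H, W) array with even H and W."""
    h, w = x.shape
    assert h % 2 == 0 and w % 2 == 0, "height and width must be even"
    # Rearrange into non-overlapping 2x2 blocks: shape (H/2, W/2, 2, 2).
    blocks = x.reshape(h // 2, 2, w // 2, 2).transpose(0, 2, 1, 3)
    # Filtering with stride 2 is a dot product of each block with each kernel.
    return {name: np.einsum("ijkl,kl->ij", blocks, k) for name, k in KERNELS.items()}


def haar_idwt2(bands):
    """Inverse transform: the kernels are orthonormal, so each 2x2 block is
    the sum of the kernels weighted by their sub-band coefficients."""
    h2, w2 = bands["LL"].shape
    blocks = sum(bands[name][:, :, None, None] * k for name, k in KERNELS.items())
    return blocks.transpose(0, 2, 1, 3).reshape(2 * h2, 2 * w2)
```

Because the four kernels form an orthonormal basis of each 2×2 block, applying `haar_idwt2` to the unmodified sub-bands returns the input exactly, so any change in the reconstructed image comes only from how the sub-bands are processed in between.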
2. Low-Frequency Denoising: Neural ODE-Driven LFSP-ODE
The LL branch (low-frequency, homogeneous) undergoes feature transformation and continuous dynamic modeling via neural ordinary differential equations (ODEs). Denoising and structural preservation are expressed as the continuous-time evolution of the low-frequency feature state under a learned neural field. The ODE is solved numerically with a randomized Euler integrator over a fixed number of steps.
The neural field comprises seven Conv–BN–ReLU blocks interleaved with two DASS (Dynamic Attentive State-Space) modules, which combine CBAM attention mechanisms and lightweight VMamba state-space processing to fuse local and global features. The final state yields the denoised low-frequency feature.
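To make the numerical integration concrete, the sketch below shows a plain fixed-step explicit Euler loop, which repeatedly applies the update h ← h + Δt · f(h, t). It rests on several assumptions: the function names are hypothetical, the randomized variant mentioned in the text is not reproduced, and `toy_field` is a stand-in for the learned network, not the paper's architecture.

```python
import numpy as np


def euler_integrate(field, h0, t0=0.0, t1=1.0, steps=4):
    """Integrate dh/dt = field(h, t) from t0 to t1 with fixed-step explicit Euler."""
    h = np.asarray(h0, dtype=float)
    dt = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        h = h + dt * field(h, t)  # one Euler update
        t += dt
    return h


def toy_field(h, t):
    """Hypothetical stand-in for the learned field: pulls every feature
    value toward the mean, which smooths the state over time."""
    return h.mean() - h


# Usage: the state contracts toward its mean as integration proceeds.
h_final = euler_integrate(toy_field, [0.0, 2.0], steps=4)
```

With this toy field the mean of the state is preserved while deviations shrink by a factor of (1 − Δt) per step, which illustrates why more steps give a smoother, more accurate trajectory at higher cost.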
3. High-Frequency Denoising and Enhancement (HFDE Modules)
Each high-frequency wavelet sub-band (LH, HL, HH) is mapped to a feature tensor and processed by its own HFDE module. The three modules do not share weights, and each has the following architecture:
- Encoder: Two stages of Conv–BN–ReLU–MaxPool, reducing spatial size and increasing representation depth.
- Bottleneck: One Conv–BN–ReLU and one DeformableConv–BN–ReLU, with an embedded DASS block for context fusion.
- Decoder: Two upsampling (bilinear or transposed) stages, each with preceding DeformableConvs, plus dense skip connections linking encoder stages to their respective decoder scales.
- Output: Two concluding Conv–BN–ReLU layers producing the denoised high-frequency features.
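Deformable convolution is defined, in its standard form, as
$y(p_0) = \sum_{p_n \in \mathcal{R}} w(p_n)\, x(p_0 + p_n + \Delta p_n)$
where $\mathcal{R}$ is the regular sampling grid of the kernel and the offset field $\Delta p_n$ is dynamically learned, enabling the receptive field to adapt around edges and textures for better speckle suppression and feature preservation.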
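The sampling rule of a deformable convolution can be illustrated at a single output location. The sketch below is a minimal NumPy version under stated assumptions: a 3×3 kernel, offsets supplied by hand rather than predicted by a network, bilinear interpolation for fractional positions, and zero padding outside the image. The helper names are hypothetical and this is not the paper's implementation.

```python
import numpy as np


def bilinear_sample(x, r, c):
    """Bilinearly interpolate the 2D array x at fractional position (r, c).
    Positions outside the array contribute zero."""
    h, w = x.shape
    r0, c0 = int(np.floor(r)), int(np.floor(c))
    value = 0.0
    for rr in (r0, r0 + 1):
        for cc in (c0, c0 + 1):
            if 0 <= rr < h and 0 <= cc < w:
                weight = (1 - abs(r - rr)) * (1 - abs(c - cc))
                value += weight * x[rr, cc]
    return value


def deform_conv_at(x, weight, offsets, p0):
    """Evaluate a 3x3 deformable convolution at output location p0.
    offsets is a sequence of nine (d_row, d_col) pairs, one per kernel tap,
    listed in row-major order of the 3x3 grid."""
    grid = [(i, j) for i in (-1, 0, 1) for j in (-1, 0, 1)]
    out = 0.0
    for n, (i, j) in enumerate(grid):
        d_row, d_col = offsets[n]
        sample = bilinear_sample(x, p0[0] + i + d_row, p0[1] + j + d_col)
        out += weight[i + 1, j + 1] * sample
    return out
```

With all offsets set to zero the function reduces to an ordinary 3×3 filter response, and non-zero offsets move each tap independently, which is what lets the kernel follow an edge instead of averaging across it.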
4. Feature Fusion, Reconstruction, and Optimization
Post-denoising, all four features are concatenated (producing $4C$ channels), fused via a convolution and a residual block, and reconstructed into the despeckled image via the inverse Haar transform.
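The channel bookkeeping of the fusion step can be sketched as follows. This is only an illustration under assumptions the source does not confirm: the fusion convolution is taken to be 1×1 (so it acts as a per-pixel linear map over channels), the residual block is omitted, and `fuse_1x1` is a hypothetical name.

```python
import numpy as np


def fuse_1x1(features, weight, bias):
    """Concatenate feature maps along the channel axis and apply a 1x1 convolution.

    features: list of four arrays, each of shape (C, H, W)
    weight:   array of shape (C_out, 4C), one row per output channel
    bias:     array of shape (C_out,)
    """
    stacked = np.concatenate(features, axis=0)  # shape (4C, H, W)
    # A 1x1 convolution mixes channels independently at every pixel.
    return np.einsum("oc,chw->ohw", weight, stacked) + bias[:, None, None]


# Usage with C = 2 channels per branch and a 3x3 spatial grid.
branches = [np.full((2, 3, 3), float(k)) for k in (1, 2, 3, 4)]
fused = fuse_1x1(branches, np.full((2, 8), 1.0 / 8.0), np.zeros(2))
```

Here the weight matrix simply averages the eight concatenated channels; in a trained model it would be learned so that each output channel draws on the low- and high-frequency branches in different proportions.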
Training is supervised against clean reference images.
No additional frequency-specific weighting or explicit regularizing statistics are incorporated, although the DASS modules exert a mild regularizing influence.
5. Training Protocol and Implementation Details
SAR-FAH is trained on synthetic SAR data using the UC Merced Land-Use (UCL) dataset, and on texture datasets DTD (47 classes) and FMD (10 classes). Key parameters are:
| Parameter | Setting | Comments |
|---|---|---|
| Image Preprocessing | Center crop/resize to a fixed size | Standard across datasets |
| Patching | Cut into fixed-size patches | Patch count increased by augmentation |
| Optimizer | Adam with cosine-annealed learning rate | |
| Batch Size | 16 | |
| Epochs | 20 | |
| Feature Channels | 128 | |
| NODE Solver | torchdiffeq, Euler steps | Implementation detail |
| Framework / Hardware | PyTorch 2.2.2, NVIDIA RTX 3090 | |
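The cosine-annealing schedule named in the table follows a simple closed form, sketched below in plain Python. The learning-rate bounds used here (`lr_max=1e-3`, `lr_min=1e-5`) are hypothetical placeholders, since the actual values are not available in this text; only the 20-epoch horizon comes from the table.

```python
import math


def cosine_annealing_lr(epoch, total_epochs, lr_max, lr_min=0.0):
    """Learning rate at a given epoch under cosine annealing without restarts."""
    cos_term = 1.0 + math.cos(math.pi * epoch / total_epochs)
    return lr_min + 0.5 * (lr_max - lr_min) * cos_term


# Hypothetical bounds; 20 epochs as listed in the table.
schedule = [cosine_annealing_lr(e, 20, lr_max=1e-3, lr_min=1e-5) for e in range(21)]
```

The rate starts at `lr_max`, decays slowly at first, fastest around the midpoint, and flattens out as it approaches `lr_min` at the final epoch.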
6. Empirical Evaluation and Ablations
Quantitative metrics on synthetic SAR (UCL), texture datasets, and real SAR imagery underscore SAR-FAH's performance:
- Synthetic SAR (UCL, ): PSNR = 23.61 dB, SSIM = 0.6260 (vs. 23.37 dB, 0.6076 for SAR-Trans).
- Synthetic SAR (): PSNR = 28.41 dB (vs. 28.10 for SAR-CAM).
- Texture benchmarks (DTD, FMD): SAR-FAH attains the highest PSNR/SSIM among the compared methods.
- No-reference metrics (GaoFen-1 L1): ENL = 1213.8; SAR-FAH ranks first or second on all reported metrics, including MoR.
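For readers unfamiliar with these metrics, the sketch below gives their commonly used definitions in NumPy: PSNR against a clean reference, the equivalent number of looks (ENL) as squared mean over variance of a homogeneous intensity region, and the mean of ratio (MoR) between the noisy and despeckled images, whose ideal value is 1. The function names are hypothetical, the paper's exact evaluation protocol (region selection, peak value) is not given in this text, and SSIM is omitted for brevity.

```python
import numpy as np


def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, peak]."""
    diff = np.asarray(reference, dtype=float) - np.asarray(estimate, dtype=float)
    mse = np.mean(diff ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)


def enl(region):
    """Equivalent number of looks of a homogeneous intensity region.
    Higher values indicate stronger speckle suppression."""
    region = np.asarray(region, dtype=float)
    return region.mean() ** 2 / region.var()


def mean_of_ratio(noisy, despeckled):
    """Mean of the ratio image noisy / despeckled; values near 1 indicate
    that the filter preserved the radiometric mean."""
    noisy = np.asarray(noisy, dtype=float)
    despeckled = np.asarray(despeckled, dtype=float)
    return np.mean(noisy / despeckled)
```

ENL and MoR need no clean reference, which is why they are the metrics reported for real SAR imagery, while PSNR and SSIM are limited to the synthetic benchmarks.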
Ablation studies demonstrate:
| Ablation | Resulting PSNR (dB) / SSIM | Interpretation |
|---|---|---|
| Remove DASS | Down to 23.40/0.6046 | DASS is crucial for local/global feature fusion |
| Remove NODE | Down to 23.18/0.5966 | NODE yields superior performance for low-frequency despeckling |
| Deformable Conv (dec) | Down to 23.52 | Asymmetric placement is most effective |
| Shared HFDE | Down to 23.44 | Unshared HFDE enables band-specialized denoising |
| HFDE for LF | PSNR 24.41, artifacts | LFSP-ODE is essential for artifact-free low-frequency restoration |
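The size of each ablation effect is easier to compare as a drop relative to the full model. The short calculation below assumes that the ablations were run in the same setting as the 23.61 dB synthetic UCL result reported above, which the table does not state explicitly. The "HFDE for LF" row is left out because the table characterizes it by artifacts rather than by a PSNR drop.

```python
# Assumed reference: full-model PSNR on synthetic UCL data (from the results list).
baseline_psnr = 23.61

# PSNR values (dB) copied from the ablation table.
ablations = {
    "Remove DASS": 23.40,
    "Remove NODE": 23.18,
    "Deformable Conv (dec)": 23.52,
    "Shared HFDE": 23.44,
}

# Drop in dB relative to the assumed baseline, rounded to two decimals.
drops = {name: round(baseline_psnr - value, 2) for name, value in ablations.items()}

# Print the ablations from largest to smallest drop.
for name, drop in sorted(drops.items(), key=lambda item: -item[1]):
    print(f"{name}: -{drop:.2f} dB")
```

Under that assumption, removing the neural ODE costs the most (0.43 dB), followed by removing DASS (0.21 dB), sharing the HFDE weights (0.17 dB), and changing the deformable-convolution placement (0.09 dB).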
7. Computational Efficiency and Scalability
SAR-FAH comprises 7.32 million parameters, and its inference time per image depends on the number of ODE steps. Adjusting the step count trades accuracy against inference speed:
- At one alternative step count, PSNR is 23.42 dB.
- SAR-FAH consistently matches or outperforms competitors (e.g., SAR-Trans, HTNet) with fewer parameters and competitive GPU throughput.
A plausible implication is that frequency-adaptive architectures can achieve high despeckling quality with faster inference and lower model complexity than monolithic networks, particularly when leveraging specialized submodules tuned for heterogeneous spatial statistics.
Frequency-adaptive heterogeneous despeckling, as instantiated in SAR-FAH, establishes a principled, empirically validated framework for SAR restoration, leveraging wavelet-domain decomposition and module specialization to circumvent the limitations of single-branch deep learning approaches. This technique achieves state-of-the-art speckle suppression and structural preservation across both synthetic and operational SAR imagery by dynamically aligning model capacity and receptive fields with the statistical heterogeneity of natural scenes (Ma et al., 8 Nov 2025).