Low-Guided High-Frequency Injection Module
- The LGHI module is a neural network component that fuses low-frequency global structures with selectively enhanced high-frequency details using multi-scale decomposition.
- It is integrated into decoding or upsampling pathways with fixed or learnable filters, depthwise convolutions, and gated cross-frequency attention to preserve detail.
- Empirical results in medical imaging, HDR imaging, and financial modeling show improvements in segmentation metrics, PSNR, and ROI, demonstrating its practical impact.
Low-Guided High-Frequency Injection (LGHI) modules are a class of neural network components designed for progressive refinement of representations by fusing low-frequency structural information with selectively injected high-frequency detail. They are primarily motivated by frequency-domain analysis, exploiting the observation that salient global structures are predominantly encoded in low-frequency components, while local details and abrupt transitions are contained in high-frequency bands. LGHI has found applications in diverse domains, including medical image segmentation (Mu et al., 10 Sep 2025), high dynamic range (HDR) imaging (Dai et al., 2021), and financial time series modeling (Li et al., 19 Jan 2026).
1. Fundamental Principles and Architectural Placement
LGHI modules operate by decomposing features into frequency bands—either with fixed wavelets (e.g., Haar) or with learnable FIR filter banks. The core paradigm is “low-guided high-frequency injection”: high-frequency subbands, rich in local or transient details, are selectively and controllably synthesized into the backbone low-frequency representation.
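This frequency split can be made concrete with a minimal one-level Haar transform. The sketch below (1-D for brevity; the modules discussed here operate on 2-D features or sequences) shows that the low-pass band tracks the overall signal level while the high-pass band is nonzero only at abrupt transitions:

```python
import numpy as np

def haar_dwt_1d(x):
    """One level of the orthonormal 1-D Haar DWT: average/difference pairs."""
    even, odd = x[0::2], x[1::2]
    low = (even + odd) / np.sqrt(2.0)   # low-pass: smooth global structure
    high = (even - odd) / np.sqrt(2.0)  # high-pass: local transitions
    return low, high

# A step edge: the low band tracks the overall level, while the high band
# fires only in the pair that straddles the discontinuity.
x = np.array([1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 1.0, 1.0])
low, high = haar_dwt_1d(x)
```

This is exactly the observation LGHI exploits: detail enhancement can be confined to the high-pass band without disturbing the global structure held in the low-pass band.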
Architecturally, LGHI modules are typically situated within decoding or upsampling pathways. For instance, in SFD-Mamba2Net, LGHI (via the PHFP module) is placed at the head of each decoder block, intervening before the main Conv–BN–ReLU block and dropout regularization (Mu et al., 10 Sep 2025). In FHDRNet, LGHI governs every upsampling stage, operating in parallel on low-frequency fusion and high-frequency subband aggregation (Dai et al., 2021). In WaveLSFormer, LGHI sits after the wavelet front-end, fusing multi-scale features before Transformer encoding (Li et al., 19 Jan 2026). This architectural strategy ensures that fine details are handled in context with the prevailing global structure per decoding resolution or temporal position.
2. Mathematical Formulations and Injection Mechanisms
The mechanism of LGHI is instantiated through frequency-domain decomposition, detail enhancement, and guided re-fusion. The design varies according to task and architecture:
- Multi-level Wavelet Decomposition and Reconstruction: PHFP in SFD-Mamba2Net applies an $L$-level Haar cascade to the decoder input feature $X$, recursively generating low-pass ($X_{LL}^{(\ell)}$) and high-pass ($X_{LH}^{(\ell)}$, $X_{HL}^{(\ell)}$, $X_{HH}^{(\ell)}$) sub-bands:

$$\big(X_{LL}^{(\ell)}, X_{LH}^{(\ell)}, X_{HL}^{(\ell)}, X_{HH}^{(\ell)}\big) = \mathrm{DWT}\big(X_{LL}^{(\ell-1)}\big), \qquad \ell = 1, \dots, L, \qquad X_{LL}^{(0)} = X.$$

High-frequency components at each level are enhanced by depthwise 5×5 convolution ($\mathrm{DWConv}_{5\times 5}$) and injected into the low-frequency stream via staged inverse wavelet transforms:

$$\tilde{X}^{(\ell-1)} = \mathrm{IWT}\big(\tilde{X}^{(\ell)},\, \mathrm{DWConv}_{5\times 5}(X_{LH}^{(\ell)}, X_{HL}^{(\ell)}, X_{HH}^{(\ell)})\big), \qquad \tilde{X}^{(L)} = X_{LL}^{(L)}.$$

The output is:

$$Y = X + \tilde{X}^{(0)}.$$
Here, the low-frequency anchor at each level “guides” the injection and reconstruction of high-frequency detail, ensuring that global consistency is preserved (Mu et al., 10 Sep 2025).
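The recursion above can be sketched in a few lines of numpy. This is a 1-D illustration, not the published 2-D implementation: a simple scaling stands in for the depthwise 5×5 convolution, and the function name `phfp_1d` is ours:

```python
import numpy as np

def dwt(x):
    """One level of the 1-D Haar DWT (average/difference)."""
    e, o = x[0::2], x[1::2]
    return (e + o) / np.sqrt(2.0), (e - o) / np.sqrt(2.0)

def iwt(low, high):
    """Exact inverse of dwt: interleave reconstructed even/odd samples."""
    x = np.empty(2 * low.size)
    x[0::2] = (low + high) / np.sqrt(2.0)
    x[1::2] = (low - high) / np.sqrt(2.0)
    return x

def phfp_1d(x, levels=2, enhance=lambda h: 1.5 * h):
    """1-D sketch of the PHFP recursion: cascade DWT, enhance each high
    band (stand-in for the depthwise 5x5 conv), then staged IWT."""
    highs, low = [], x
    for _ in range(levels):
        low, high = dwt(low)
        highs.append(enhance(high))   # enhanced high-frequency band
    for high in reversed(highs):      # staged reconstruction, anchored
        low = iwt(low, high)          # on the low-frequency stream
    return x + low                    # residual sum with backbone feature

y = phfp_1d(np.arange(8.0))
```

With an identity `enhance`, the cascade reconstructs the input exactly (Haar DWT/IWT are perfect-reconstruction), so the module reduces to a plain residual doubling; any detail amplification therefore enters only through the high-band enhancement.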
- Attention-Based Frequency Fusion and Gating: In frequency-guided upsampling for HDR imaging, LGHI leverages attention masks to gate low-frequency features from multiple sources, while independently fusing high-frequency subbands via learned convolutional aggregation. The upsampled output is produced by inverse Haar DWT using the fused low- and high-frequency maps (Dai et al., 2021).
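The gating idea can be illustrated with a toy per-pixel mask. The shapes, the scalar weights standing in for a learned 1×1 convolution, and the variable names below are our illustrative assumptions, not FHDRNet's actual layers:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy low-frequency maps from a reference and a supporting exposure.
low_ref  = rng.normal(size=(4, 4))
low_supp = rng.normal(size=(4, 4))
w = rng.normal(size=2)  # stand-in for a learned 1x1 conv over both inputs

# A sigmoid attention mask, computed from both sources, gates how much of
# the supporting low-frequency map is merged into the reference stream.
mask  = sigmoid(w[0] * low_ref + w[1] * low_supp)
fused = low_ref + mask * low_supp
```

High-frequency subbands bypass this mask and are instead merged by their own convolutional aggregation before the inverse DWT, so detail fusion and structural gating remain decoupled.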
- Low-Guided Cross-Frequency Attention: In WaveLSFormer, LGHI is implemented as a multi-head cross-attention module: queries and keys are derived from the low-frequency stream $L$, while values come from the high-frequency stream $H$. The refined high-frequency contribution is fused into the low-frequency backbone via a gated residual connection:

$$L' = L + \sigma(\alpha) \cdot \mathrm{MHA}\big(Q = L W_Q,\; K = L W_K,\; V = H W_V\big),$$

with a trainable scalar $\alpha$ initialized negative to limit early-stage injection. This mechanism ensures that high-frequency cues modulate the low-frequency trend only where the trend deems them relevant, suppressing noise (Li et al., 19 Jan 2026).
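A single-head numpy sketch of this low-guided cross-attention follows; the toy dimensions, random projections, and the specific gate value are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

T, d = 6, 8                      # toy sequence length and channel width
L = rng.normal(size=(T, d))      # low-frequency (trend) stream
H = rng.normal(size=(T, d))      # high-frequency (detail) stream
Wq, Wk, Wv = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))
alpha = -2.0                     # gate logit, initialized negative

# Queries AND keys come from the low-frequency stream, values from the
# high-frequency stream: the trend decides where detail is injected.
A = softmax((L @ Wq) @ (L @ Wk).T / np.sqrt(d))
refined = A @ (H @ Wv)
out = L + sigmoid(alpha) * refined   # gated residual: weak early injection
```

Because both queries and keys are functions of `L`, the attention map is entirely trend-determined; `H` only supplies content, never routing.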
3. Implementation Strategies and Parameterization
Commonalities across published LGHI variants include:
- Wavelet decomposition: Use of Haar 2D DWT (spatial or temporal), often fixed but sometimes learnable (as in WaveLSFormer).
- Convolutional enhancements: High-frequency channels are refined with depthwise or standard convolution for channelwise specificity.
- Branch aggregation: In multi-source imaging, attention-based gating modulates the contribution of supporting feature maps before fusion (Dai et al., 2021).
- Inverse DWT: Reconstruction leverages IDWT to restore spatial or temporal resolution post-injection.
- Residual and gated fusion: Scalar gates (sigmoid-activated) control the injection intensity of high-frequency content.
- Hyperparameters: The number of wavelet levels, kernel sizes (e.g., 5×5 for PHFP, 3×3 for FHDRNet), and the number of attention heads are set according to computational cost and task resolution.
No explicit attention maps are generally learned within the injection path in SFD-Mamba2Net; the frequency-domain fusion coupled with depthwise convolution suffices to steer the injection process (Mu et al., 10 Sep 2025).
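Since the spatial variants rely on the 2-D Haar DWT, a minimal implementation of the four-band split (LL, LH, HL, HH) is sketched below; the row-then-column factorization is the standard construction, while the test image is our toy example:

```python
import numpy as np

def haar_dwt_2d(x):
    """One level of the 2-D Haar DWT via row then column transforms,
    yielding the four sub-bands LL, LH, HL, HH named in the text."""
    def split(a):  # 1-D Haar along the last axis
        e, o = a[..., 0::2], a[..., 1::2]
        return (e + o) / np.sqrt(2.0), (e - o) / np.sqrt(2.0)
    lo, hi = split(x)                      # transform rows
    ll, lh = split(lo.swapaxes(-1, -2))    # then columns of each band
    hl, hh = split(hi.swapaxes(-1, -2))
    return tuple(b.swapaxes(-1, -2) for b in (ll, lh, hl, hh))

x = np.outer(np.arange(4.0), np.ones(4))   # purely vertical gradient image
ll, lh, hl, hh = haar_dwt_2d(x)
```

As expected, a purely vertical gradient produces energy only in LL and LH; HL and HH, which respond to horizontal and diagonal variation, remain zero. The orthonormal transform also preserves total energy, which is why injection gates (rather than the decomposition itself) must control detail intensity.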
4. Training Regimes and Regularization
LGHI-equipped networks are trained end-to-end, relying on task-specific losses:
- SFD-Mamba2Net: Standard mean squared error over the final segmentation map, with dropout and $\ell_2$-decay regularization. No explicit frequency-domain or wavelet-based losses are required; multi-level LGHI enables gradient propagation to detail-level features (Mu et al., 10 Sep 2025).
- FHDRNet: The objective includes a log-transformed reconstruction (μ-law) loss and an edge-aware Sobel loss to encourage spatial fidelity, optimized with Adam under staged learning-rate scheduling (Dai et al., 2021).
- WaveLSFormer: Direct task reward (trading returns) and risk-aware regularization, with explicit spectral regularizers applied only to the learnable wavelet filter banks. LGHI itself is stabilized by initializing the injection gate small to avoid gradient explosion (Li et al., 19 Jan 2026).
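The stabilizing effect of a small gate initialization is simple arithmetic: a sigmoid over a negative logit yields a small injection weight, so early gradients through the high-frequency path are damped. The specific logit values below are illustrative, not the paper's settings:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# A negative gate logit keeps early-stage injection weak; training can
# later open the gate as the high-frequency path becomes trustworthy.
weights = {logit: sigmoid(logit) for logit in (-3.0, 0.0, 3.0)}
for logit, w in weights.items():
    print(f"gate logit {logit:+.1f} -> injection weight {w:.3f}")
```

At a logit of −3 the injected branch contributes under 5% of its raw magnitude, while a zero initialization would already pass half of it through.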
5. Empirical Impact and Ablation Insights
Ablation studies consistently show substantial benefits of LGHI modules over alternatives such as simple concatenation, average fusion, or single-branch decoders:
| Model / Variant | Segmentation or Task Metric | LGHI Effect | Reference |
|---|---|---|---|
| U-Net (baseline) | Dice: 85.03% | – | (Mu et al., 10 Sep 2025) |
| U-Net + PHFP (LGHI) | Dice: 85.65% | Dice +0.62 pt, HD95 −1.84 px, ASSD −0.29 px | (Mu et al., 10 Sep 2025) |
| FHDRNet (full) | PSNR-μ: 43.91 dB | – | (Dai et al., 2021) |
| FHDRNet w/o LGHI attention | PSNR-μ: 43.47 dB | −0.44 dB | (Dai et al., 2021) |
| WaveLSFormer + LGHI | ROI: 0.607, Sharpe: 2.157 | +193% ROI, +165% Sharpe vs concat/linear fusion | (Li et al., 19 Jan 2026) |
| WaveLSFormer w/ concat fusion | ROI: 0.207, Sharpe: 0.814 | Baseline | (Li et al., 19 Jan 2026) |
PHFP in SFD-Mamba2Net yields marked improvements in both Dice overlap and boundary metrics (HD95, ASSD) for ICA vessel segmentation; Grad-CAM activations confirm superior fine-branch and edge localization with LGHI. In HDR imaging, learned low-frequency gating and injected high-frequency fusion boost PSNR and visual sharpness. In risk-optimized equity trading, LGHI leads to much higher and more stable returns.
6. Cross-Domain Variants and Theoretical Rationale
Although architectural details differ, LGHI modules share core theoretical motivations:
- Low-frequency guidance: Injection is modulated by global structure, suppressing spurious noise and preserving semantic context.
- Selective high-frequency enhancement: Fine details are adaptively synthesized where warranted, rather than globally or indiscriminately.
- Gradient stability: Gated injection prevents adverse effects such as exploding Jacobians or loss of training signal in deep stacks (Li et al., 19 Jan 2026).
- Multi-source or multi-scale context: In multimodal input settings, attention or gating mechanisms within LGHI enable selective aggregation of auxiliary information.
A plausible implication is that LGHI-style modules can generalize to other domains requiring high-fidelity detail preservation without loss of structural integrity, provided that meaningful signal decomposition can be achieved.
7. Representative Implementations
Below are brief schematic representations of LGHI variants:
- PHFP (SFD-Mamba2Net): Multi-level 2D wavelet split → depthwise convolution enhancement of high-pass bands → recursive IWT-based fusion with low-pass → sum with backbone feature (Mu et al., 10 Sep 2025).
- LGHI (FHDRNet): Parallel attention-based low-frequency gating, learned high-frequency fusion, re-injection via IDWT at each upsampling scale (Dai et al., 2021).
- LGHI (WaveLSFormer): Multi-head attention where queries and keys from low-frequency stream attend to high-frequency values, fused via a small scalar gate into the state for downstream Transformer processing (Li et al., 19 Jan 2026).
These implementations demonstrate LGHI’s adaptability in spatial, spatiotemporal, and sequential feature hierarchies, consistently yielding higher fidelity and stability in fine-grained tasks.