
Illumination Guided Modulation Block

Updated 3 February 2026
  • IGM Block is a spatially adaptive modulation unit that dynamically gates feature activations using per-pixel illumination maps to improve restoration in underexposed regions.
  • It integrates self-attention and guided-attention mechanisms to refine features and preserve details in well-lit areas within a dual-stream architecture.
  • Empirical studies show that increasing block depth enhances metrics like PSNR and SSIM, highlighting the modular design’s impact on low-light super-resolution performance.

The Illumination Guided Modulation Block (IGM Block) is a spatially adaptive feature modulation unit designed to couple explicit illumination priors with data-driven attention for image enhancement under severe illumination degradations. Introduced in the context of low-light image super-resolution as part of the Guided Texture and Feature Modulation Network (GTFMN), the IGM Block dynamically gates and refines feature activations based on a dense per-pixel illumination map, achieving targeted intensification in underexposed regions and detail preservation in well-lit areas. This architecture provides a modular and mathematically precise solution for joint illumination enhancement and super-resolution, validated with state-of-the-art quantitative and qualitative performance on established benchmarks (Huang et al., 27 Jan 2026).

1. Architectural Context and Motivation

The IGM Block resides within a dual-stream framework comprising an Illumination Stream and a Texture Stream. The Illumination Stream predicts a spatially varying illumination map $\mathbf{M} \in [0,1]^{H \times W \times 1}$ from a low-light input $I_{\mathrm{LR}} \in \mathbb{R}^{H \times W \times 3}$, leveraging a structure decoder (producing $\mathbf{M}_{\mathrm{spatial}}$) and a global brightness predictor ($g$). The Texture Stream consists of a deep cascade of $N$ IGM Blocks, each modulating the current feature tensor $\mathbf{F}_{i-1}$ under the guidance of $\mathbf{M}$:

$$\mathbf{F}_i = \mathcal{B}_i(\mathbf{F}_{i-1}, \mathbf{M}).$$

The central objective is to realize spatially adaptive restoration, especially benefitting images with highly nonuniform illumination.
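The cascade above can be sketched as a simple loop in which every block receives the same illumination map; the function name and the callable interface are illustrative assumptions, not the authors' code:

```python
def run_texture_stream(features, illum_map, blocks):
    """Apply the cascade F_i = B_i(F_{i-1}, M).

    Every block sees the unchanged illumination map M alongside the
    evolving feature tensor; `blocks` is any sequence of callables
    taking (features, illum_map) and returning updated features.
    """
    f = features
    for block in blocks:
        f = block(f, illum_map)
    return f
```

The key design point this loop makes explicit is that $\mathbf{M}$ is computed once by the Illumination Stream and then reused at every depth, rather than being re-estimated per block.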

2. Illumination Map Generation and Normalization

The Illumination Stream produces two outputs:

  • A per-pixel spatial map: $\mathbf{M}_{\mathrm{spatial}} \in [0,1]^{H \times W \times 1}$
  • A global scalar brightness: $g \in [0,1]$

The illumination map is normalized as:

$$\mathbf{M} = \mathrm{clamp}\left(0,\; \frac{\mathbf{M}_{\mathrm{spatial}}}{\mathbb{E}[\mathbf{M}_{\mathrm{spatial}}] + \varepsilon} \times g,\; 1\right),$$

where $\mathbb{E}[\cdot]$ denotes the spatial mean over $H \times W$ and $\varepsilon = 10^{-6}$, ensuring stable scale adaptation and numerical stability. This map steers the downstream modulation process.
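A minimal sketch of this normalization in PyTorch, assuming `(B, 1, H, W)` tensor layout (the function name is hypothetical):

```python
import torch

def normalize_illumination(m_spatial: torch.Tensor, g: float,
                           eps: float = 1e-6) -> torch.Tensor:
    """Mean-normalize the spatial illumination map, scale by the global
    brightness g, and clamp to [0, 1], following the formula above."""
    # Spatial mean E[M_spatial] over H x W, kept broadcastable.
    mean = m_spatial.mean(dim=(-2, -1), keepdim=True)
    return torch.clamp(m_spatial / (mean + eps) * g, 0.0, 1.0)
```

Dividing by the spatial mean makes the map scale-invariant before the global brightness $g$ reintroduces an absolute level, which is what keeps the guidance stable across images of very different exposure.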

3. IGM Block: Mathematical Formulation and Layerwise Design

Let $\mathbf{F}_{\mathrm{in}} \in \mathbb{R}^{H \times W \times C}$ denote the incoming features and $\mathbf{M} \in \mathbb{R}^{H \times W \times 1}$ the illumination map. The IGM Block processes these via:

Attention Maps

  • Self-attention: computed by a multi-scale attention (MSAttn) layer, comprising parallel convolutions (kernel sizes 1×1, 3×3, 5×5), ReLU, and a sigmoid, to yield

$$\mathbf{A}_{\mathrm{self}} = \mathrm{MSAttn}(\mathbf{F}_{\mathrm{in}}) \in \mathbb{R}^{H \times W \times C}$$

  • Guided-attention: produced by a two-layer adapter from $\mathbf{M}$:

$$\mathbf{A}_{\mathrm{guide}} = \sigma\left(\mathrm{Conv}_{1\times1}(\phi(\mathrm{Conv}_{1\times1}(\mathbf{M})))\right) \in \mathbb{R}^{H \times W \times C},$$

where $\phi = \mathrm{ReLU}$, $\sigma$ is the sigmoid, and $\mathbf{M}$ is broadcast to $C$ channels.

The two attention maps are fused additively:

$$\mathbf{A}_{\mathrm{final}} = \mathbf{A}_{\mathrm{self}} + \mathbf{A}_{\mathrm{guide}}.$$

Feature Gating and Update

After channelwise normalization (e.g., LayerNorm or BN), features are multiplicatively modulated:

$$\hat{\mathbf{F}} = \mathrm{Norm}(\mathbf{F}_{\mathrm{in}}), \qquad \mathbf{F}_{\mathrm{mod}} = \hat{\mathbf{F}} \odot \mathbf{A}_{\mathrm{final}}.$$

A two-layer feed-forward network (FFN; Conv 1×1 → ReLU → Conv 1×1) refines Fmod\mathbf{F}_{\mathrm{mod}}, and a residual skip connection yields:

$$\mathbf{F}_{\mathrm{out}} = \mathbf{F}_{\mathrm{in}} + \mathrm{FFN}(\mathbf{F}_{\mathrm{mod}}).$$
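The full forward pass can be sketched as a single PyTorch module. The exact fusion inside MSAttn (here: sum of the parallel convolutions, then ReLU and sigmoid) and the use of `GroupNorm(1, C)` as the channelwise normalization are assumptions for illustration, not the authors' exact implementation:

```python
import torch
from torch import nn

class IGMBlock(nn.Module):
    """Sketch of one IGM Block following the formulation above
    (layer wiring inside MSAttn is an assumption)."""

    def __init__(self, channels: int = 64):
        super().__init__()
        # Multi-scale self-attention branch: parallel 1x1 / 3x3 / 5x5 convs.
        self.ms1 = nn.Conv2d(channels, channels, 1)
        self.ms3 = nn.Conv2d(channels, channels, 3, padding=1)
        self.ms5 = nn.Conv2d(channels, channels, 5, padding=2)
        # Two-layer adapter lifting the 1-channel map M to C channels.
        self.adapter = nn.Sequential(
            nn.Conv2d(1, channels, 1), nn.ReLU(),
            nn.Conv2d(channels, channels, 1),
        )
        # Channelwise normalization stand-in (paper allows LayerNorm or BN).
        self.norm = nn.GroupNorm(1, channels)
        # Two-layer FFN: Conv 1x1 -> ReLU -> Conv 1x1.
        self.ffn = nn.Sequential(
            nn.Conv2d(channels, channels, 1), nn.ReLU(),
            nn.Conv2d(channels, channels, 1),
        )

    def forward(self, f: torch.Tensor, m: torch.Tensor) -> torch.Tensor:
        # A_self = MSAttn(F_in): multi-scale convs, ReLU, sigmoid.
        a_self = torch.sigmoid(torch.relu(self.ms1(f) + self.ms3(f) + self.ms5(f)))
        # A_guide = sigmoid(adapter(M)), broadcast to C channels.
        a_guide = torch.sigmoid(self.adapter(m))
        a_final = a_self + a_guide          # additive fusion
        f_mod = self.norm(f) * a_final      # gated modulation
        return f + self.ffn(f_mod)          # residual update
```

Because both attention maps pass through a sigmoid, the fused gate lies in $[0, 2]$ per element, so the block can both attenuate and amplify normalized features before the residual skip restores the input signal.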

Layerwise Structure

| Branch | Layer Sequence | Output Shape |
| --- | --- | --- |
| Self-attention | Conv 1×1 → ReLU → Conv 3×3 (multi-scale) → Sigmoid | $H \times W \times C$ |
| Guided-attention | Conv 1×1 → ReLU → Conv 1×1 → Sigmoid (adapter on $\mathbf{M}$) | $H \times W \times C$ |
| Fusion & Gating | Add, normalize, element-wise multiply | $H \times W \times C$ |
| FFN + Residual | Conv 1×1 → ReLU → Conv 1×1; add skip connection | $H \times W \times C$ |

Typically, $C = 64$ channels and $N = 64$ blocks are used.

4. Spatial Adaptivity and Feature Dynamics

The per-pixel variation in $\mathbf{M}$ results in spatially adaptive guided attention $\mathbf{A}_{\mathrm{guide}}$, which, upon fusion with the data-driven $\mathbf{A}_{\mathrm{self}}$, enables the block to amplify features where required. In regions with low predicted illumination, the gating value increases, intensifying the enhancement; in well-illuminated regions, the effect is attenuated, preserving detail and avoiding over-enhancement. This targeted behavior is a direct consequence of the per-pixel guidance and normalization protocol (Huang et al., 27 Jan 2026).

5. Empirical Evidence and Ablation Findings

Ablation studies have demonstrated the criticality of both the guidance mechanism and block depth:

  • Block Depth: Increasing the number of IGM Blocks in the Texture Stream from $N = 16$ to $N = 64$ raises PSNR on OmniNormal5 from 38.005 dB to 38.106 dB and SSIM from 0.9820 to 0.9824, indicating monotonic quality improvements with depth.
  • Guidance Removal: Eliminating the illumination guidance (reducing the block to a plain residual block) diminishes spatial adaptivity and final image quality; e.g., at $4\times$ super-resolution on OmniNormal15, PSNR drops from 30.3 dB (with guidance) to 30.2 dB (without), and SSIM from 0.919 to 0.916.

A plausible implication is that illumination-guided attention is essential for localization of enhancement and for maximizing both quantitative and perceptual quality, especially in non-uniform illumination scenarios (Huang et al., 27 Jan 2026).

6. Implementation Details and Training Protocol

Key implementation parameters are as follows:

  • IGM Block count: $N = 64$
  • Channels: $C = 64$ (initial input $3 \rightarrow 64$)
  • Optimizer: Adam, learning rate $2 \times 10^{-4}$
  • Loss: $\ell_1$ loss on the Y channel between the super-resolved output and the ground-truth HR image
  • Batch size: 16
  • Hardware: trained on an RTX A6000
  • Parameter count: $\approx 8.78$M (Texture Stream, Illumination Stream, all IGM Blocks)
  • Framework: PyTorch with BasicSR for data I/O and PixelShuffle

All architectural and training details are reproducible from the original publication and can be adapted in new architectures exploiting the modularity of the IGM Block (Huang et al., 27 Jan 2026).
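The training loss can be sketched as follows. The source specifies only "$\ell_1$ on the Y channel"; using the BT.601 luma weights to extract Y from RGB is an assumption, as is the function name:

```python
import torch

# ITU-R BT.601 luma weights; their use here is an assumption about
# how the Y channel is derived from RGB.
_Y_WEIGHTS = torch.tensor([0.299, 0.587, 0.114])

def y_channel_l1(sr: torch.Tensor, hr: torch.Tensor) -> torch.Tensor:
    """l1 loss on the luminance (Y) channel of (B, 3, H, W) RGB tensors."""
    w = _Y_WEIGHTS.to(sr.device).view(1, 3, 1, 1)
    y_sr = (sr * w).sum(dim=1)  # weighted RGB sum -> (B, H, W) luminance
    y_hr = (hr * w).sum(dim=1)
    return torch.abs(y_sr - y_hr).mean()
```

Restricting the loss to luminance concentrates the supervision on brightness and structure, which matches the low-light setting where chrominance errors are secondary.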

7. Modularity and Generalization

The IGM Block’s structural decomposition—comprising self-attention, guided-attention, additive fusion, channelwise normalization, and lightweight FFN—renders it modular. This division enables straightforward replacement of attention modules or integration into alternative architectures. The design supports further augmentation for other spatially adaptive restoration tasks beyond low-light super-resolution, as the guidance interface is agnostic to the specific prior used (illumination or otherwise) (Huang et al., 27 Jan 2026).
