
IlluminateNet: Unsupervised Underwater Enhancer

Updated 6 February 2026
  • IlluminateNet is a fully unsupervised CNN module that enhances underwater images by adaptively correcting luminance and restoring color balance.
  • It utilizes a dual-stream architecture with a channel-wise atmospheric-light estimator and transmission map estimation for global and local corrections.
  • Empirical results show significant improvements in UCIQE, UIQM, and keypoint repeatability, boosting performance in robotic vision and underwater imaging tasks.

IlluminateNet is a fully unsupervised convolutional neural network (CNN) module designed for adaptive luminance enhancement in underwater images. Developed within the DIVER (Domain-Invariant Visual Enhancement and Restoration) framework, IlluminateNet aims to recover achromatic, brightness-balanced renderings of severely degraded raw underwater photographs. It achieves domain-invariant correction of illumination and color loss caused by wavelength-dependent attenuation, scattering, and illumination non-uniformity encountered in diverse aquatic environments, including shallow, deep, and turbid scenes. IlluminateNet is specifically invoked for low-light scenarios, delivering substantial improvements in both perceptual quality and downstream robotic vision metrics (Makam et al., 30 Jan 2026).

1. Architectural Structure and Processing Workflow

IlluminateNet operates on raw underwater RGB images $U_R \in \mathbb{R}^{H\times W\times 3}$ and produces an illumination-corrected output $U_I \in \mathbb{R}^{H\times W\times 3}$. The architecture consists of the following primary components:

  1. Channel-wise Atmospheric-Light Estimator: Each color channel ($R$, $G$, $B$) of $U_R$ is processed by a small per-channel CNN consisting of $3\times 3$ convolutional layers and ReLU nonlinearities, generating feature maps $\phi_c(x)$. These are concatenated and passed through an element-wise $\tanh$ nonlinearity to compute a global atmospheric-light estimate:

$$U_G(x) = \tanh\big(\phi_R(x)\,\Vert\,\phi_G(x)\,\Vert\,\phi_B(x)\big).$$

The $\tanh$ activation constrains each output channel to $(-1, 1)$.

  2. Transmission Map Estimation (Hybrid Rule + CNN): A patch-max difference is computed relative to an “ambient light” statistic $A$, defined as the mean of the top 0.1% farthest-depth pixels according to a learned depth model:

$$T(x) = \max_{y\,:\,\|y-x\|\le r} \frac{|U_R(y) - A|}{\max(A,\, 1-A)}.$$

This is further refined by a $3\times 3$ CNN layer with ReLU to yield a smoothed transmission map $\tilde T(x)$.

  3. Luminance Residual Computation: The transmission-guided luminance residual is computed pointwise as

$$U_T(x) = \max\left(0,\ \frac{U_R(x) - U_G(x)}{\tilde T(x)}\right).$$

  4. Fusion and Skip Connection: The final illumination-corrected image is obtained via additive fusion (“residual skip”) of $U_G$ and $U_T$:

$$U_I(x) = U_G(x) + U_T(x).$$

ReLU activations are employed throughout all convolutional layers, with a $\tanh$ nonlinearity exclusively at the atmospheric-light stream output.

This architecture decouples the estimation of global illumination (via $U_G$) from local, transmission-guided correction (via $U_T$), supporting compensation for both spatially global and locally variant degradations.
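The four steps above can be sketched end to end in NumPy. The snippet below is a minimal, hypothetical stand-in: random per-channel weights replace the learned $3\times 3$ CNNs, the transmission-refinement layer is reduced to a clamp, and the ambient light $A$ is passed in as a scalar.

```python
import numpy as np

def illuminatenet_forward(U_R, A, r=1, rng=None):
    """Sketch of IlluminateNet's forward pass on one HxWx3 image in [0, 1].

    Random per-channel weights stand in for the learned CNN streams; the
    transmission-refinement CNN is replaced by a simple clamp.
    """
    rng = np.random.default_rng(0) if rng is None else rng

    # 1. Channel-wise atmospheric-light stream: per-channel ReLU features,
    #    concatenated and squashed with tanh into a global light estimate.
    w = rng.normal(size=3)  # stand-in per-channel weights
    phi = np.stack(
        [np.maximum(0.0, w[c] * U_R[..., c]) for c in range(3)], axis=-1
    )
    U_G = np.tanh(phi)  # each channel constrained to (-1, 1)

    # 2. Patch-max transmission map relative to ambient light A.
    diff = np.abs(U_R - A) / max(A, 1.0 - A)
    T = np.zeros_like(diff)
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            T = np.maximum(T, np.roll(diff, (dy, dx), axis=(0, 1)))
    T_tilde = np.clip(T, 1e-3, None)  # refinement CNN omitted; avoid /0

    # 3. Transmission-guided luminance residual (ReLU keeps it nonnegative).
    U_T = np.maximum(0.0, (U_R - U_G) / T_tilde)

    # 4. Additive fusion ("residual skip").
    return U_G + U_T
```

The patch-max in step 2 is implemented with shifted copies of the difference map, which matches the $\max_{y:\|y-x\|\le r}$ neighborhood for a square window of radius `r`.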

2. Mathematical Model and Image Formation

IlluminateNet’s formulation is based on a simplified underwater image formation prior derived in DIVER:

$$U_R(x) = U_I(x)\,T(x) + U_G(x)\,[1 - T(x)].$$

Solving for $U_I$:

$$U_I(x) = U_G(x) + \frac{U_R(x) - U_G(x)}{T(x)}.$$

A patch-based maximum transmission map is estimated:

$$T(x) = \max_{y \in \psi(x)} \frac{|U_R(y) - A|}{\max(A,\, 1-A)},$$

where $\psi(x)$ is the neighborhood of pixel $x$. The global-light estimate is produced by

$$U_G(x) = \tanh\big(\mathrm{ReLU}(U_R^R)\,\Vert\,\mathrm{ReLU}(U_R^G)\,\Vert\,\mathrm{ReLU}(U_R^B)\big).$$

Residual smoothing is performed as:

$$U_T(x) = \mathrm{ReLU}\left(\frac{U_R(x) - U_G(x)}{\tilde T(x)}\right).$$

This model provides an interpretable mechanism for disentangling illumination correction from color channel balancing, and it enforces physical plausibility in the correction process.
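Because the correction formula is the algebraic inverse of the formation prior, $U_I$ is recovered exactly whenever $T(x) > 0$. A quick NumPy check with synthetic $U_I$, $U_G$, and $T$ (all hypothetical values, for illustration only) confirms the round trip:

```python
import numpy as np

rng = np.random.default_rng(0)
U_I = rng.uniform(0.0, 1.0, (4, 4, 3))  # "true" corrected image
U_G = rng.uniform(0.0, 1.0, (4, 4, 3))  # global atmospheric light
T = rng.uniform(0.2, 1.0, (4, 4, 1))    # transmission, bounded away from 0

# Forward formation model: U_R = U_I * T + U_G * (1 - T)
U_R = U_I * T + U_G * (1.0 - T)

# Inversion used by IlluminateNet: U_I = U_G + (U_R - U_G) / T
U_I_hat = U_G + (U_R - U_G) / T

assert np.allclose(U_I_hat, U_I)
```

In practice the learned $\tilde T(x)$ only approximates the true transmission, so the recovery is approximate rather than exact; the check above isolates the algebra.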

3. Loss Functions and Unsupervised Training Paradigm

IlluminateNet leverages only unpaired underwater images for training, relying on unsupervised objectives that avoid dependence on reference clean ground truth. Two complementary losses are used:

  1. Gray-World Loss $\mathcal{L}_G$ (Chromatic Neutrality):

$$\mu^c = \mathrm{mean}_x\big[U_I^c(x)\big], \qquad \mu^g = \tfrac{1}{3}\sum_c \mu^c,$$

$$\mathcal{L}_G = \tfrac{1}{3}\sum_{c\in\{R,G,B\}} \big|\mu^c - \mu^g\big|.$$

This loss penalizes deviation from global channel-wise mean equality, enforcing achromatic (gray-world) neutrality.

  2. Luminous Loss $\mathcal{L}_L$ (Exposure Consistency):

$$\mathcal{L}_L = \tfrac{1}{3}\sum_{c}\frac{1}{N}\sum_{x}\big(U_I^c(x) - U_t^c\big)^2,$$

where $U_t^c$ is a mid-gray or white target (e.g., $0.5$ or $1.0$).

The total loss is a weighted sum:

$$\mathcal{L}_T = \lambda_1 \mathcal{L}_G + \lambda_2 \mathcal{L}_L,$$

with $\lambda_1 = 0.25$ and $\lambda_2 = 1$. The network is optimized with Adam at a learning rate of $1\times 10^{-3}$ for 150 iterations with a batch size of $8$.

Domain-invariance is enforced by sampling mixed minibatches from all eight training datasets, encompassing a range of water types and illumination regimes, and exclusively utilizing loss functions that generalize across domains.
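Both objectives reduce to a few lines of array arithmetic. A minimal NumPy sketch, assuming images scaled to $[0,1]$ and a mid-gray target of $0.5$ (the function names are illustrative, not from the paper):

```python
import numpy as np

def gray_world_loss(U_I):
    """L_G: mean absolute deviation of per-channel means from their average."""
    mu_c = U_I.mean(axis=(0, 1))  # per-channel means mu^c
    mu_g = mu_c.mean()            # gray-world mean mu^g
    return float(np.abs(mu_c - mu_g).mean())

def luminous_loss(U_I, target=0.5):
    """L_L: mean squared deviation from a mid-gray (or white) target."""
    return float(((U_I - target) ** 2).mean())

def total_loss(U_I, lam_g=0.25, lam_l=1.0):
    """L_T = lambda_1 * L_G + lambda_2 * L_L with the paper's weights."""
    return lam_g * gray_world_loss(U_I) + lam_l * luminous_loss(U_I)
```

A uniformly mid-gray image incurs zero loss under both terms, which is exactly the achromatic, well-exposed fixed point these objectives pull toward.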

4. Integration within the DIVER Framework

IlluminateNet serves as one of two initial illumination correction modules within the DIVER pipeline (Makam et al., 30 Jan 2026). For each input, an Illumination Assessment Gate computes average red, green, and blue values $R_{\mathrm{avg}}$, $G_{\mathrm{avg}}$, $B_{\mathrm{avg}}$ to assess scene lighting. If $R_{\mathrm{avg}} < G_{\mathrm{avg}}/5$ or $R_{\mathrm{avg}} < B_{\mathrm{avg}}/5$, indicating pronounced low-light conditions, the pipeline invokes IlluminateNet; otherwise, a Spectral Equalization Filter (SEF) is used. The output, $U_I$ (for low light) or $U_S$ (for well-lit scenes), is passed to the Adaptive Optical Correction Module (AOCM) for hue and contrast refinement, and subsequently to Hydro-OpticNet for physics-guided dehazing and attenuation compensation.
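The gate itself reduces to a channel-mean comparison. A sketch of the routing logic (function name and return labels are hypothetical):

```python
import numpy as np

def illumination_gate(U_R):
    """Route an HxWx3 RGB image in [0, 1] to the appropriate DIVER module.

    Implements the thresholds R_avg < G_avg / 5 or R_avg < B_avg / 5,
    returning "illuminatenet" for low-light scenes and "sef" otherwise.
    """
    r_avg, g_avg, b_avg = U_R.mean(axis=(0, 1))
    if r_avg < g_avg / 5 or r_avg < b_avg / 5:
        return "illuminatenet"
    return "sef"
```

The red channel is the discriminator because red light attenuates fastest underwater, so a severely suppressed red mean signals deep or poorly lit water.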

5. Empirical Performance and Ablation Studies

IlluminateNet demonstrates substantial improvements in quantitative and qualitative metrics. On the low-light SeaThru dataset, the incorporation of IlluminateNet boosts UCIQE from $0.1062$ (raw) to $0.7007$ (a $6.6\times$ gain) and UIQM from $0.9980$ to $2.3685$. Further downstream modules slightly modify these scores, with full DIVER culminating at UCIQE $0.8470$ and UIQM $2.8685$. On UFO-120, state-of-the-art methods exhibit only incremental gains, whereas integration of IlluminateNet (via SEF + AOCM + Hydro-OpticNet) yields increases in PSNR from $12.67$ dB (raw) to $23.69$ dB and UCIQE to $0.9620$. Color-chart fidelity on SeaThru is also improved, with GPMAE (geodesic color error) reduced from raw errors $>20^\circ$ to $2$–$4^\circ$ within DIVER.

For robotic perception tasks, such as ORB-based keypoint repeatability and matching, IlluminateNet alone significantly increases the number of stable keypoints detected—from single digits in raw input to hundreds after correction, and over $1,000$ with full DIVER processing. This increase in repeatable matches suggests improved robustness for vision-based robotic tasks in challenging underwater conditions.

6. Data Regime, Optimization, and Domain-Generalization

IlluminateNet is trained using unpaired underwater images sourced from eight diverse datasets: SeaThru, OceanDark, USOD10K, U45, FISHTRAC, UIEB, UFO-120, and LSUI. Images are resized or cropped, and ambient light $A$ is computed from the most distant pixels inferred by a depth model. No ground truth references are utilized. The training configuration—characterized by global minibatch mixing and loss terms independent of water type—promotes domain-invariant operation, with the module maintaining generalized performance across varied aquatic settings.

7. Summary and Context within Underwater Enhancement

IlluminateNet is a lightweight, standalone CNN module embedded in the DIVER architecture for robust, domain-invariant luminance and color restoration of underwater images. It operates via a learnable atmospheric-light map plus a transmission-guided residual, and is trained using simple, physically motivated losses. Its contribution is critical to DIVER’s superior performance over prior state-of-the-art methods, accounting for a more than $6\times$ gain in UCIQE and over $70\%$ reduction in chromatic error on challenging benchmarks. Its unsupervised, domain-agnostic training methodology and transparent physical modeling differentiate it from previous approaches and underpin its effectiveness for both human and machine-based downstream applications (Makam et al., 30 Jan 2026).
