
IlluminateNet: Unsupervised Underwater Enhancer

Updated 6 February 2026
  • IlluminateNet is a fully unsupervised CNN module that enhances underwater images by adaptively correcting luminance and restoring color balance.
  • It utilizes a dual-stream architecture with a channel-wise atmospheric-light estimator and transmission map estimation for global and local corrections.
  • Empirical results show significant improvements in UCIQE, UIQM, and keypoint repeatability, boosting performance in robotic vision and underwater imaging tasks.

IlluminateNet is a fully unsupervised convolutional neural network (CNN) module designed for adaptive luminance enhancement in underwater images. Developed within the DIVER (Domain-Invariant Visual Enhancement and Restoration) framework, IlluminateNet aims to recover achromatic, brightness-balanced renderings of severely degraded raw underwater photographs. It achieves domain-invariant correction of illumination and color loss caused by wavelength-dependent attenuation, scattering, and illumination non-uniformity encountered in diverse aquatic environments, including shallow, deep, and turbid scenes. IlluminateNet is specifically invoked for low-light scenarios, delivering substantial improvements in both perceptual quality and downstream robotic vision metrics (Makam et al., 30 Jan 2026).

1. Architectural Structure and Processing Workflow

IlluminateNet operates on raw underwater RGB images $U_R \in \mathbb{R}^{H\times W\times 3}$ and produces an illumination-corrected output $U_I \in \mathbb{R}^{H\times W\times 3}$. The architecture consists of the following primary components:

  1. Channel-wise Atmospheric-Light Estimator: Each color channel ($R$, $G$, $B$) of $U_R$ is processed by a small per-channel CNN consisting of $3\times 3$ convolutional layers and ReLU nonlinearities, generating feature maps $\phi_c(x)$. These are concatenated and passed through an element-wise $\tanh$ nonlinearity to compute a global atmospheric-light estimate:

$$U_G(x) = \tanh\big(\phi_R(x)\,\Vert\,\phi_G(x)\,\Vert\,\phi_B(x)\big).$$

The $\tanh$ activation constrains each output channel to $(-1, 1)$.

  2. Transmission Map Estimation (Hybrid Rule + CNN): A patch-max difference is computed relative to an “ambient light” statistic $A$, defined as the mean of the top 0.1% farthest-depth pixels according to a learned depth model:

$$T(x) = \max_{y\,:\,\|y-x\|\le r} \frac{|U_R(y) - A|}{\max(A,\, 1-A)}.$$

This is further refined by a $3\times 3$ CNN layer with ReLU to yield a smoothed transmission map $\tilde T(x)$.

  3. Luminance Residual Computation: The transmission-guided luminance residual is computed pointwise as

$$U_T(x) = \max\left(0,\ \frac{U_R(x) - U_G(x)}{\tilde T(x)}\right).$$

  4. Fusion and Skip Connection: The final illumination-corrected image is obtained via additive fusion (“residual skip”) of $U_G$ and $U_T$:

$$U_I(x) = U_G(x) + U_T(x).$$

ReLU activations are employed throughout all convolutional layers, with a $\tanh$ nonlinearity exclusively at the atmospheric-light stream output.

This architecture decouples the estimation of global illumination (via $U_G$) from local, transmission-guided correction (via $U_T$), supporting compensation for both spatially global and locally variant degradations.
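The four steps above can be sketched end to end in NumPy. The snippet below is a minimal, hypothetical stand-in: random per-channel weights replace the learned $3\times 3$ CNNs, the transmission-refinement layer is reduced to a clamp, and the ambient light $A$ is passed in as a scalar.

```python
import numpy as np

def illuminatenet_forward(U_R, A, r=1, rng=None):
    """Sketch of IlluminateNet's forward pass on one HxWx3 image in [0, 1].

    Random per-channel weights stand in for the learned CNN streams; the
    transmission-refinement CNN is replaced by a simple clamp.
    """
    rng = np.random.default_rng(0) if rng is None else rng

    # 1. Channel-wise atmospheric-light stream: per-channel ReLU features,
    #    concatenated and squashed with tanh into a global light estimate.
    w = rng.normal(size=3)  # stand-in per-channel weights
    phi = np.stack(
        [np.maximum(0.0, w[c] * U_R[..., c]) for c in range(3)], axis=-1
    )
    U_G = np.tanh(phi)  # each channel constrained to (-1, 1)

    # 2. Patch-max transmission map relative to ambient light A.
    diff = np.abs(U_R - A) / max(A, 1.0 - A)
    T = np.zeros_like(diff)
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            T = np.maximum(T, np.roll(diff, (dy, dx), axis=(0, 1)))
    T_tilde = np.clip(T, 1e-3, None)  # refinement CNN omitted; avoid /0

    # 3. Transmission-guided luminance residual (ReLU keeps it nonnegative).
    U_T = np.maximum(0.0, (U_R - U_G) / T_tilde)

    # 4. Additive fusion ("residual skip").
    return U_G + U_T
```

The patch-max in step 2 is implemented with shifted copies of the difference map, which matches the $\max_{y:\|y-x\|\le r}$ neighborhood for a square window of radius `r`.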

2. Mathematical Model and Image Formation

IlluminateNet’s formulation is based on a simplified underwater image formation prior derived in DIVER:

$$U_R(x) = U_I(x)\,T(x) + U_G(x)\,[1 - T(x)].$$

Solving for $U_I$:

$$U_I(x) = U_G(x) + \frac{U_R(x) - U_G(x)}{T(x)}.$$

A patch-based maximum transmission map is estimated:

$$T(x) = \max_{y \in \psi(x)} \frac{|U_R(y) - A|}{\max(A,\, 1-A)},$$

where $\psi(x)$ is the neighborhood of pixel $x$. The global-light estimate is produced by

$$U_G(x) = \tanh\big(\mathrm{ReLU}(U_R^R)\,\Vert\,\mathrm{ReLU}(U_R^G)\,\Vert\,\mathrm{ReLU}(U_R^B)\big).$$

Residual smoothing is performed as:

$$U_T(x) = \mathrm{ReLU}\left(\frac{U_R(x) - U_G(x)}{\tilde T(x)}\right).$$

This model provides an interpretable mechanism for disentangling illumination correction from color channel balancing, and it enforces physical plausibility in the correction process.
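Because the correction formula is the algebraic inverse of the formation prior, $U_I$ is recovered exactly whenever $T(x) > 0$. A quick NumPy check with synthetic $U_I$, $U_G$, and $T$ (all hypothetical values, for illustration only) confirms the round trip:

```python
import numpy as np

rng = np.random.default_rng(0)
U_I = rng.uniform(0.0, 1.0, (4, 4, 3))  # "true" corrected image
U_G = rng.uniform(0.0, 1.0, (4, 4, 3))  # global atmospheric light
T = rng.uniform(0.2, 1.0, (4, 4, 1))    # transmission, bounded away from 0

# Forward formation model: U_R = U_I * T + U_G * (1 - T)
U_R = U_I * T + U_G * (1.0 - T)

# Inversion used by IlluminateNet: U_I = U_G + (U_R - U_G) / T
U_I_hat = U_G + (U_R - U_G) / T

assert np.allclose(U_I_hat, U_I)
```

In practice the learned $\tilde T(x)$ only approximates the true transmission, so the recovery is approximate rather than exact; the check above isolates the algebra.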

3. Loss Functions and Unsupervised Training Paradigm

IlluminateNet leverages only unpaired underwater images for training, relying on unsupervised objectives that avoid dependence on reference clean ground truth. Two complementary losses are used:

  1. Gray-World Loss $\mathcal{L}_G$ (Chromatic Neutrality):

$$\mu^c = \mathrm{mean}_x\big[U_I^c(x)\big], \qquad \mu^g = \tfrac{1}{3}\sum_c \mu^c,$$

$$\mathcal{L}_G = \tfrac{1}{3}\sum_{c\in\{R,G,B\}} \big|\mu^c - \mu^g\big|.$$

This loss penalizes deviation from global channel-wise mean equality, enforcing achromatic (gray-world) neutrality.

  2. Luminous Loss $\mathcal{L}_L$ (Exposure Consistency):

$$\mathcal{L}_L = \tfrac{1}{3}\sum_{c}\frac{1}{N}\sum_{x}\big(U_I^c(x) - U_t^c\big)^2,$$

where $U_t^c$ is a mid-gray or white target (e.g., $0.5$ or $1.0$).

The total loss is a weighted sum:

$$\mathcal{L}_T = \lambda_1 \mathcal{L}_G + \lambda_2 \mathcal{L}_L,$$

with $\lambda_1 = 0.25$ and $\lambda_2 = 1$. The network is optimized with Adam at a learning rate of $1\times 10^{-3}$ for 150 iterations with a batch size of $8$.

Domain-invariance is enforced by sampling mixed minibatches from all eight training datasets, encompassing a range of water types and illumination regimes, and exclusively utilizing loss functions that generalize across domains.
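Both objectives reduce to a few lines of array arithmetic. A minimal NumPy sketch, assuming images scaled to $[0,1]$ and a mid-gray target of $0.5$ (the function names are illustrative, not from the paper):

```python
import numpy as np

def gray_world_loss(U_I):
    """L_G: mean absolute deviation of per-channel means from their average."""
    mu_c = U_I.mean(axis=(0, 1))  # per-channel means mu^c
    mu_g = mu_c.mean()            # gray-world mean mu^g
    return float(np.abs(mu_c - mu_g).mean())

def luminous_loss(U_I, target=0.5):
    """L_L: mean squared deviation from a mid-gray (or white) target."""
    return float(((U_I - target) ** 2).mean())

def total_loss(U_I, lam_g=0.25, lam_l=1.0):
    """L_T = lambda_1 * L_G + lambda_2 * L_L with the paper's weights."""
    return lam_g * gray_world_loss(U_I) + lam_l * luminous_loss(U_I)
```

A uniformly mid-gray image incurs zero loss under both terms, which is exactly the achromatic, well-exposed fixed point these objectives pull toward.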

4. Integration within the DIVER Framework

IlluminateNet serves as one of two initial illumination correction modules within the DIVER pipeline (Makam et al., 30 Jan 2026). For each input, an Illumination Assessment Gate computes average red, green, and blue values $R_{\mathrm{avg}}$, $G_{\mathrm{avg}}$, $B_{\mathrm{avg}}$ to assess scene lighting. If $R_{\mathrm{avg}} < G_{\mathrm{avg}}/5$ or $R_{\mathrm{avg}} < B_{\mathrm{avg}}/5$, indicating pronounced low-light conditions, the pipeline invokes IlluminateNet; otherwise, a Spectral Equalization Filter (SEF) is used. The output, $U_I$ (for low light) or $U_S$ (for well-lit scenes), is passed to the Adaptive Optical Correction Module (AOCM) for hue and contrast refinement, and subsequently to Hydro-OpticNet for physics-guided dehazing and attenuation compensation.
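The gate itself reduces to a channel-mean comparison. A sketch of the routing logic (function name and return labels are hypothetical):

```python
import numpy as np

def illumination_gate(U_R):
    """Route an HxWx3 RGB image in [0, 1] to the appropriate DIVER module.

    Implements the thresholds R_avg < G_avg / 5 or R_avg < B_avg / 5,
    returning "illuminatenet" for low-light scenes and "sef" otherwise.
    """
    r_avg, g_avg, b_avg = U_R.mean(axis=(0, 1))
    if r_avg < g_avg / 5 or r_avg < b_avg / 5:
        return "illuminatenet"
    return "sef"
```

The red channel is the discriminator because red light attenuates fastest underwater, so a severely suppressed red mean signals deep or poorly lit water.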

5. Empirical Performance and Ablation Studies

IlluminateNet demonstrates substantial improvements in quantitative and qualitative metrics. On the low-light SeaThru dataset, the incorporation of IlluminateNet boosts UCIQE from $0.1062$ (raw) to $0.7007$ (a $6.6\times$ gain) and UIQM from $0.9980$ to $2.3685$. Further downstream modules slightly modify these scores, with full DIVER culminating at UCIQE $0.8470$ and UIQM $2.8685$. On UFO-120, state-of-the-art methods exhibit only incremental gains, whereas integration of IlluminateNet (via SEF + AOCM + Hydro-OpticNet) yields increases in PSNR from $12.67$ dB (raw) to $23.69$ dB and UCIQE to $0.9620$. Color-chart fidelity on SeaThru is also improved, with GPMAE (geodesic color error) reduced from raw errors $>20^\circ$ to $2$–$4^\circ$ within DIVER.

For robotic perception tasks, such as ORB-based keypoint repeatability and matching, IlluminateNet alone significantly increases the number of stable keypoints detected—from single digits in raw input to hundreds after correction, and over $1,000$ with full DIVER processing. This increase in repeatable matches suggests improved robustness for vision-based robotic tasks in challenging underwater conditions.

6. Data Regime, Optimization, and Domain-Generalization

IlluminateNet is trained using unpaired underwater images sourced from eight diverse datasets: SeaThru, OceanDark, USOD10K, U45, FISHTRAC, UIEB, UFO-120, and LSUI. Images are resized or cropped, and ambient light $A$ is computed from the most distant pixels inferred by a depth model. No ground truth references are utilized. The training configuration—characterized by global minibatch mixing and loss terms independent of water type—promotes domain-invariant operation, with the module maintaining generalized performance across varied aquatic settings.

7. Summary and Context within Underwater Enhancement

IlluminateNet is a lightweight, standalone CNN module embedded in the DIVER architecture for robust, domain-invariant luminance and color restoration of underwater images. It operates via a learnable atmospheric-light map plus a transmission-guided residual, and is trained using simple, physically motivated losses. Its contribution is critical to DIVER’s superior performance over prior state-of-the-art methods, accounting for a more than $6\times$ gain in UCIQE and over $70\%$ reduction in chromatic error on challenging benchmarks. Its unsupervised, domain-agnostic training methodology and transparent physical modeling differentiate it from previous approaches and underpin its effectiveness for both human and machine-based downstream applications (Makam et al., 30 Jan 2026).
