
Multi-Scale GNN Ocean Model

Updated 26 January 2026
  • Multi-scale graph neural network-based ocean models are innovative frameworks that transform regular ocean grids into hierarchically structured graphs capturing both slow thermohaline changes and fast dynamic phenomena.
  • They integrate oceanographic domain knowledge with encoder-processor-decoder architectures and multi-scale message-passing to improve subseasonal-to-seasonal forecasts and eddy-resolving predictions.
  • These models employ advanced techniques such as hierarchical pooling, adaptive cross-scale communication, and optimized loss functions to achieve significant computational speed-ups and high simulation fidelity.

A multi-scale graph neural network-based ocean model employs graph-theoretic data representations and hierarchical message-passing neural networks to simulate the ocean’s spatiotemporal dynamics across a wide range of spatial and temporal scales. These architectures fuse oceanographic domain knowledge with machine learning primitives to emulate both slow-evolving thermohaline processes and fast, multiscale phenomena such as eddy formation, wave propagation, and boundary-layer interactions. Recent implementations advance subseasonal-to-seasonal simulation accuracy, eddy-resolving short-term forecasts, and efficient surrogate modeling for parameter space exploration (Gao et al., 27 May 2025, Hirabayashi et al., 19 Jan 2026, Holmberg et al., 2024, Shi et al., 2022).

1. Multi-Scale Graph Construction Principles

The foundation of multi-scale GNN ocean modeling is a discretization scheme that transforms ocean variables, typically recorded on regular latitude–longitude grids, into graph-based domains. Nodes represent grid cells or mesh elements, while edges encode local (neighboring) and nonlocal (teleconnection, long-range) interactions depending on the model variant (Gao et al., 27 May 2025). Edge sets may include:

  • Intra-mesh edges: connections for immediate, medium, and long-range neighbors to encode multiple spatial scales.
  • Grid-to-mesh and mesh-to-grid edges: "lifting" and "projection" operations that transfer information between regular grids and higher-connectivity meshes, facilitating hierarchical aggregation and refinement.

Node features at each spatial location consist of ocean state variables (multi-depth scalar fields, sea surface height) and atmospheric forcing (e.g. wind, temperature, mean sea-level pressure) (Hirabayashi et al., 19 Jan 2026). Edge features typically contain positional displacement and, in some models, learned latent representations to capture directional interaction or physical coupling.

Hierarchical construction forms a cascade of coarse-to-fine graphs, each representing ocean dynamics at distinct spatial resolutions. Coarsening is accomplished via cell-grid pooling, graph clustering, or mesh element grouping (e.g. unstructured mesh aggregation in GNN-Surrogate (Shi et al., 2022)). Cross-scale message-passing enables the model to exchange information between levels using bipartite graphs or pooling/unpooling operators (Holmberg et al., 2024, Cuervo-Londoño et al., 30 May 2025).
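To make the construction steps concrete, the following NumPy sketch builds one level of a grid graph and a coarsening assignment. The function names, the 4-neighbour edge set, and the 2×2 pooling factor are illustrative assumptions for this sketch, not the scheme of any cited paper:

```python
import numpy as np

def grid_graph_edges(h, w):
    """Directed edges linking each cell of an h x w grid to its
    4-neighbours: one intra-mesh edge set at a single spatial scale."""
    idx = np.arange(h * w).reshape(h, w)
    edges = []
    for di, dj in [(0, 1), (1, 0)]:          # east and south neighbours
        a = idx[: h - di, : w - dj].ravel()
        b = idx[di:, dj:].ravel()
        edges.append(np.stack([a, b]))       # forward direction
        edges.append(np.stack([b, a]))       # reverse direction
    return np.concatenate(edges, axis=1)     # shape (2, num_edges)

def coarsen_assignment(h, w, factor=2):
    """Map each fine-grid cell to a coarse cell (factor x factor pooling):
    a simple form of the grid-to-mesh 'lifting' used for hierarchical
    aggregation."""
    cw = (w + factor - 1) // factor          # coarse-grid width
    rows = np.arange(h) // factor
    cols = np.arange(w) // factor
    return (rows[:, None] * cw + cols[None, :]).ravel()
```

Longer-range intra-mesh edges (medium and long neighbours, teleconnections) would be added as further edge sets in the same `(2, num_edges)` layout.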

2. Neural Model Architecture and Temporal Frameworks

Core to the multi-scale ocean GNN is an encoder-processor-decoder design (Hirabayashi et al., 19 Jan 2026, Gao et al., 27 May 2025):

  • Encoder: Transforms input features from regular grids into mesh embeddings. Node features and edge attributes are mapped via multilayer perceptrons (MLPs) into a shared latent space of dimension $d_{\rm model}$.
  • Processor: Implements repeated message-passing over mesh graphs. NeuralOM employs 16 multi-scale interactive messaging (MIM) blocks, each combining edge-level physical process updates (differential, multiplicative, and alignment terms via LayerNorm and SiLU nonlinearities) and node-level aggregation modules (sum for small-scale, mean for large-scale, adaptively gated). Other architectures alternate multi-scale graph convolutions, hierarchical pooling/unpooling, and temporal convolution layers to capture spatiotemporal dependencies (Ning et al., 2023, Cachay et al., 2020).
  • Decoder: Projects mesh embeddings back to grid nodes, generating the forecast via a final MLP.
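The encoder-processor-decoder flow can be sketched as a minimal NumPy prototype. The MLP sizes, the residual sum-aggregation message-passing form, and all function names here are illustrative assumptions; real implementations use trained deep networks with the block structures described above:

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(d_in, d_hidden, d_out):
    """Two-layer MLP parameters with small random weights."""
    return (rng.normal(0, 0.1, (d_in, d_hidden)), np.zeros(d_hidden),
            rng.normal(0, 0.1, (d_hidden, d_out)), np.zeros(d_out))

def mlp(params, x):
    W1, b1, W2, b2 = params
    return np.tanh(x @ W1 + b1) @ W2 + b2

def message_pass(h, edges, edge_mlp, node_mlp):
    """One processor step: edge messages, scatter-sum aggregation onto
    receiver nodes, then a residual node update."""
    src, dst = edges
    msg = mlp(edge_mlp, np.concatenate([h[src], h[dst]], axis=1))
    agg = np.zeros_like(h)
    np.add.at(agg, dst, msg)                 # unbuffered scatter-sum
    return h + mlp(node_mlp, np.concatenate([h, agg], axis=1))

def forecast(x, edges, enc, proc_layers, dec):
    h = mlp(enc, x)                          # encoder: grid -> latent mesh
    for edge_mlp, node_mlp in proc_layers:   # processor: repeated passing
        h = message_pass(h, edges, edge_mlp, node_mlp)
    return mlp(dec, h)                       # decoder: latent -> forecast
```

One full forward pass produces the next ocean state, which is fed back in autoregressively for longer lead times.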

Temporal frameworks are tailored to oceanic thermal inertia and anomaly propagation, most commonly through autoregressive rollout over fixed time steps.

3. Message Passing, Edge and Node Update Formulations

The canonical update equations in multi-scale GNN ocean models are rooted in physical intuition and statistical aggregation:

For a directed edge $i \to j$, with sender embedding $h_{s(i)}$ and receiver embedding $h_{r(i)}$:

$$\begin{aligned} h_{d(i)} &:= h_{s(i)} - h_{r(i)} \\ h_{mp(i)} &:= h_{s(i)} \odot h_{r(i)} \\ h_{cos(i)} &:= \frac{h_{s(i)} \cdot h_{r(i)}}{\|h_{s(i)}\| \, \|h_{r(i)}\|} \end{aligned}$$

These differential, multiplicative, and alignment terms are fused through a residual MLP and propagated via message-passing mechanisms (Gao et al., 27 May 2025). Node updates balance rapid small-scale sum aggregation with slow large-scale mean aggregation, modulated by adaptive gating:

$$h_i' = \mathrm{MLP}_{\rm node}\bigl(\gamma_i h_{\rm sum}' + (1 - \gamma_i) h_{\rm mean}'\bigr), \qquad \gamma_i = \mathrm{sigmoid}\bigl(W_g [h_{\rm sum}'; h_{\rm mean}'] + b_g\bigr).$$
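A direct NumPy transcription of these edge terms and the gated node blend might look as follows. The `eps` stabilizer, the array layout (rows are edges or nodes, columns latent dimensions), and the omission of the outer node MLP are assumptions made for brevity:

```python
import numpy as np

def edge_features(h_s, h_r, eps=1e-8):
    """Differential, multiplicative, and cosine-alignment edge terms."""
    h_d = h_s - h_r                          # differential term
    h_mp = h_s * h_r                         # multiplicative (Hadamard) term
    h_cos = np.sum(h_s * h_r, axis=1, keepdims=True) / (
        np.linalg.norm(h_s, axis=1, keepdims=True)
        * np.linalg.norm(h_r, axis=1, keepdims=True) + eps)
    return np.concatenate([h_d, h_mp, h_cos], axis=1)

def gated_node_update(h_sum, h_mean, W_g, b_g):
    """Blend fast small-scale sum aggregation with slow large-scale mean
    aggregation via a sigmoid gate gamma (outer node MLP omitted)."""
    z = np.concatenate([h_sum, h_mean], axis=1) @ W_g + b_g
    gamma = 1.0 / (1.0 + np.exp(-z))         # per-node scalar gate
    return gamma * h_sum + (1.0 - gamma) * h_mean
```

With zero gate weights the sigmoid outputs 0.5, so the update falls back to a plain average of the two aggregations.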

Multi-scale architectures (e.g., GNN-Surrogate (Shi et al., 2022)) utilize edge-conditioned convolutions to incorporate static geographical attributes and orientation. Hierarchical pooling/unpooling (e.g., DiffPool, Graph U-Net) re-aggregate node features via trainable soft clustering, enabling information flow across spatial scales (Ning et al., 2023, Holmberg et al., 2024).

Temporal dependencies are explicitly handled by gated 1D convolutions (dilated and causal) or by sequentially stacking past time steps in the input features. Spatio-temporal blocks alternate these with spatial graph convolutions across multiple edge scales (Cachay et al., 2020).
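A minimal dilated causal convolution, the temporal building block described above, can be written as follows (the left zero-padding convention and single-channel form are assumptions for the sketch):

```python
import numpy as np

def dilated_causal_conv(x, w, dilation=1):
    """1D causal convolution along time: the output at step t depends only
    on x[t], x[t - dilation], x[t - 2*dilation], ... (no future leakage)."""
    T, K = x.shape[0], w.shape[0]
    pad = (K - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])  # left zero-padding keeps causality
    return np.array([np.dot(w, xp[t : t + pad + 1 : dilation])
                     for t in range(T)])
```

Stacking such layers with growing dilation (1, 2, 4, ...) widens the temporal receptive field exponentially while staying causal.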

4. Training Objectives, Loss Functions, and Regularization

Training objectives are dominated by pointwise mean-squared error (MSE) or relative $L_2$ losses over all grid nodes, vertical levels, and variables:

$$\mathcal{L}_2 = \frac{1}{KHW} \sum_{k=1}^K \sum_{i=1}^H \sum_{j=1}^W \frac{\bigl(\hat{O}_{i,j,k}^t - O_{i,j,k}^t\bigr)^2}{\bigl(O_{i,j,k}^t\bigr)^2}$$

No physics-based regularizers are used in NeuralOM (Gao et al., 27 May 2025), whereas other models introduce explicit Dirichlet loss terms or edge-gradient losses for boundary handling and spatial consistency (Lino et al., 2022).
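The relative L2 loss reduces to a few lines of NumPy (the `eps` term guarding against zero-valued targets is an added assumption, not part of the cited formulation):

```python
import numpy as np

def relative_l2_loss(pred, target, eps=1e-8):
    """Pointwise squared error normalized by the squared target, averaged
    over variables (K), latitude (H), and longitude (W)."""
    return np.mean((pred - target) ** 2 / (target ** 2 + eps))
```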

Adaptive-resolution strategies optimize memory and computation by representing invariant regions at coarser scales, determined by reference field analysis and graph hierarchy tree cuts (Shi et al., 2022). Standard optimizers include AdamW and RMSprop with learning-rate scheduling and weight decay. Masked loss is used to exclude land points and account for grid-cell area variations in irregular geometries (Holmberg et al., 2024, Cuervo-Londoño et al., 30 May 2025).
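The masked-loss idea, excluding land points and optionally weighting by grid-cell area, can be sketched schematically (function and argument names are illustrative, not any paper's exact implementation):

```python
import numpy as np

def masked_mse(pred, target, ocean_mask, area_weights=None):
    """MSE over ocean points only: land points (mask == 0) contribute
    nothing, and optional weights correct for grid-cell area variation."""
    w = ocean_mask if area_weights is None else ocean_mask * area_weights
    return np.sum(w * (pred - target) ** 2) / np.sum(w)
```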

5. Evaluation Metrics, Benchmarks, and Computational Considerations

Performance is assessed using RMSE, anomaly correlation coefficient (ACC), critical success index (CSI), and symmetric extreme dependency index (SEDI) for extreme event evaluation (Gao et al., 27 May 2025). Kinetic energy spectrum analysis via Fourier transforms of velocity fields substantiates the preservation of multi-scale variance and eddy energy in forecast outputs (Hirabayashi et al., 19 Jan 2026).
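RMSE and ACC, the two primary skill metrics above, are straightforward to compute; a generic sketch over flattened anomaly fields (the exact area-weighting conventions of the cited papers are omitted here):

```python
import numpy as np

def rmse(pred, obs):
    """Root-mean-square error over all grid points."""
    return np.sqrt(np.mean((pred - obs) ** 2))

def acc(pred_anom, obs_anom):
    """Anomaly correlation coefficient: cosine similarity between forecast
    and observed anomaly fields (climatology already removed)."""
    num = np.sum(pred_anom * obs_anom)
    den = np.sqrt(np.sum(pred_anom ** 2) * np.sum(obs_anom ** 2))
    return num / den
```

An ACC of 1 means the forecast anomaly pattern matches observations exactly; values near 0.5 or below are conventionally treated as unskillful.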

Representative evaluation results include:

  • NeuralOM reduces 40–60 day RMSE by up to 11.6% over WenHai and increases ACC (e.g., RMSE of 0.7158 vs. 0.8093 at 60 days) (Gao et al., 27 May 2025).
  • MultiScaleGNN achieves MAEs in the few-percent range (0.0283–0.0346) for incompressible Navier–Stokes flows and 100–10,000× speed-ups over classical PDE solvers (Lino et al., 2022).
  • SeaCast delivers 100–1,000× computational speed-up over Med-PHY, with comparable temperature and velocity skill (Holmberg et al., 2024).
  • GNN-Surrogate yields PSNR improvements and a 1,500× runtime speed-up in unstructured-mesh ocean surrogate modeling (Shi et al., 2022).
  • Subregional models (e.g., GraphCast for the Canary Current) achieve up to 76% RMSE reduction at 5-day lead vs. GLORYS reanalysis (Cuervo-Londoño et al., 30 May 2025).

Test protocols typically span multidecadal reanalysis (GLORYS12, ERA5, CMEMS datasets), with initial conditions randomized or temporally split. Qualitative pattern correlation and visual preservation of eddy structure are reported as additional metrics.

6. Physical Interpretability, Applicability, and Limitations

Multi-scale GNNs encode explicit physical interactions via edge-update modules, blending process-based message-passing with data-driven discovery (Gao et al., 27 May 2025). Teleconnection edges and adaptive graph topologies allow interpretability and insight into long-range climate modes, such as ENSO teleconnections (Cachay et al., 2020).

Applicability spans:

  • Subseasonal-to-seasonal global ocean simulation (NeuralOM, MultiScaleGNN).
  • Eddy-resolving global forecasting (Hirabayashi et al., 19 Jan 2026).
  • Regional operational models (SeaCast for Mediterranean, GraphCast for Canary Current).
  • Surrogate modeling for parameter space exploration on irregular meshes (GNN-Surrogate).

Limitations include:

  • Autoregressive error accumulation for long lead-time forecasts.
  • Potential mesh-induced artifacts and sensitivity to initial conditions.
  • Lack of explicit physics-informed loss in some implementations, limiting physical constraint enforcement.
  • Need for future investigation into hybrid physics–machine learning regularization and extension to adaptive and truly unstructured mesh architectures.

7. Implementation Details, Hyperparameters, and Operational Considerations

Model dimensionalities—grid or mesh node counts, latent dimension $d_{\rm model}$, hidden size—are specified according to resolution and computational resources. Stage counts (e.g., Q=2 in NeuralOM), message-passing iterations (e.g., 16 MIM blocks in NeuralOM), and pooling levels are chosen empirically for each application.

Activation functions (SiLU, Swish, tanh, SELU), normalization (LayerNorm, batch norm), and gating mechanisms (sigmoid) are standard. Training regimens span 10–200 epochs, with large-scale parallelization (e.g., 64 A100 GPUs for NeuralOM), batch sizes of 1–240 (or variable for mesh-based models). Input anomaly normalization is consistently employed, subtracting multi-decadal climatologies before training to stabilize learning.
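Anomaly normalization against a precomputed climatology can be sketched as follows; standardizing by a climatological standard deviation, in addition to subtracting the mean, is an assumption of this sketch (some models subtract the mean only):

```python
import numpy as np

def anomaly_normalize(x, clim_mean, clim_std=None, eps=1e-6):
    """Remove a multi-decadal climatology from input fields; optionally
    standardize by the climatological standard deviation."""
    anom = x - clim_mean
    return anom if clim_std is None else anom / (clim_std + eps)
```

The same climatology must be added back (and, if used, the standard deviation multiplied back in) when converting model outputs to physical units.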

Code repositories are provided for reproducibility and extension (e.g., NeuralOM: github.com/YuanGao-YG/NeuralOM).


In sum, multi-scale graph neural network-based ocean models integrate hierarchical graph representations and physically motivated neural architectures to surpass traditional ocean modeling in speed and fidelity, offering a robust paradigm for both operational forecasting and scientific exploration across spatial and temporal scales (Gao et al., 27 May 2025, Hirabayashi et al., 19 Jan 2026, Holmberg et al., 2024, Shi et al., 2022, Cuervo-Londoño et al., 30 May 2025, Lino et al., 2022, Ning et al., 2023, Cachay et al., 2020, Chen et al., 2021).
