
Hierarchical Graph ODE (HiGO)

Updated 11 January 2026
  • HiGO is a hierarchical framework that integrates multi-level graph structures with Neural ODEs to capture continuous-time spatiotemporal dynamics.
  • It employs adaptive, context-sensitive message passing to effectively fuse local details with global context for robust wildfire forecasting.
  • Empirical evaluations reveal state-of-the-art performance, achieving higher Macro-F1 and AUPRC scores compared to conventional methods.

The Hierarchical Graph ODE (HiGO) is a machine learning framework for modeling multi-scale, continuous-time spatiotemporal dynamics, specifically designed for applications such as global wildfire activity prediction. HiGO integrates a multi-level graph hierarchy with context-sensitive, adaptive message-passing mechanisms and Neural ODE modules parameterized by graph neural networks (GNNs), enabling effective feature extraction and information fusion across spatial scales. This approach allows HiGO to represent the Earth system as a series of interconnected graph representations, each capturing progressively coarser contextual information, and to model the inherent continuous-time evolution of wildfire activity. Empirical evaluation on the SeasFire Cube dataset demonstrates that HiGO achieves state-of-the-art results, significantly outperforming point-wise, vision-based, and conventional graph-based baselines in long-range wildfire forecasting and continuous-time interpolation (Xu et al., 4 Jan 2026).

1. Multi-Level Graph Hierarchy

HiGO represents the Earth system as a pyramid of graphs $\mathcal{G} = \{G^{(1)}, G^{(2)}, \dots, G^{(L)}\}$:

  • Level 1 ($G^{(1)}$): A regular $H \times W$ latitude–longitude grid, with nodes $V^{(1)}$ corresponding to individual grid cells and intra-level edges $E^{(1)}$ forming 4-connected neighborhoods.
  • Coarser Levels ($G^{(l+1)}$): Each coarse node $v^{(l+1)}_{i,j}$ aggregates a non-overlapping $2 \times 2$ block of its children in $G^{(l)}$, maintaining a 4-way adjacency structure at each resolution.
  • Inter-Level Connections ($E^{(l,l+1)}$): Explicit edges connect each node $v^{(l)}_k$ in level $l$ to its parent $p(v^{(l)}_k)$ in level $l+1$, enabling cross-scale information flow.
  • Node/Edge Features: Each node carries a $D$-dimensional feature vector $x_i^{(l)}$; edges carry scalar features $e_{ij}^{(l)}$.

This hierarchical construction addresses the multi-scale nature of wildfire phenomena, which are influenced by local conditions (fuel, moisture), regional weather, and global teleconnections. A single-scale graph cannot feasibly represent both fine-grained patterns and large-scale dependencies without excessive computational cost. HiGO's hierarchy facilitates both local detail and global context efficiently, supporting multi-scale feature fusion.
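The hierarchy described above can be sketched in a few lines of Python. The grid layout, 4-connected edges, and $2 \times 2$ parent mapping follow the construction in this section; all function and key names (`build_pyramid`, `parent_map`, etc.) are illustrative, not from the paper:

```python
def grid_edges(H, W):
    """4-connected intra-level edges for an H x W grid (node id = i*W + j)."""
    edges = []
    for i in range(H):
        for j in range(W):
            if i + 1 < H:                      # edge to the southern neighbor
                edges.append((i * W + j, (i + 1) * W + j))
            if j + 1 < W:                      # edge to the eastern neighbor
                edges.append((i * W + j, i * W + j + 1))
    return edges

def parent_map(H, W):
    """Inter-level edges: map each fine node to its parent in the 2x2-pooled level."""
    Wc = W // 2
    return {i * W + j: (i // 2) * Wc + (j // 2)
            for i in range(H) for j in range(W)}

def build_pyramid(H, W, L):
    """Levels G^(1)..G^(L); each coarser level halves both grid dimensions."""
    levels = []
    for _ in range(L):
        levels.append({"shape": (H, W),
                       "edges": grid_edges(H, W),
                       "parent": parent_map(H, W)})
        H, W = H // 2, W // 2
    return levels
```

For example, `build_pyramid(4, 4, 2)` yields a $4 \times 4$ base grid whose node $(1,1)$ pools into coarse node $(0,0)$, matching the non-overlapping block structure.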

2. Adaptive Filtering Message Passing

HiGO employs context-aware adaptive message passing both within and across graph hierarchy levels:

Intra-Level Message Passing (Adaptive Message Passing, AdMP)

  • Attention Computation: For node $i$ with neighbors $j$ in the same level $l$, the raw attention score is computed as

$$w_{ij}^{(l)} = \phi_{\mathrm{edge}}^{(l)}(x_{i}^{(l)}, x_{j}^{(l)}, e_{ij}^{(l)}),$$

where $\phi_{\mathrm{edge}}^{(l)}$ is a multilayer perceptron (MLP).

  • Normalization: The attention coefficients are normalized over $i$'s neighborhood:

$$\alpha_{ij}^{(l)} = \frac{\exp(w_{ij}^{(l)})}{\sum_{k\in\mathcal{N}_i} \exp(w_{ik}^{(l)})}.$$

  • Message Aggregation and Node Update: The aggregated message is

$$m_i^{(l)} = \sum_{j \in \mathcal{N}_i} \alpha_{ij}^{(l)} \widehat{e}_{ij}^{(l)},$$

and node features are updated via

$$x_i^{(l),\mathrm{new}} = \phi_{\mathrm{node}}^{(l)}(x_i^{(l)}, m_i^{(l)}).$$
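For concreteness, the AdMP update for a single node $i$ can be sketched with NumPy. Here `phi_edge` and `phi_node` are random-weight stand-ins for the paper's MLPs, and the edge messages $\widehat{e}_{ij}$ are placeholders; only the score → softmax → weighted-sum → update flow mirrors the equations above:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8                                  # node feature dimension (illustrative)
x = rng.normal(size=(5, D))            # node i (row 0) and 4 neighbors
e = rng.normal(size=(5, 1))            # scalar edge features e_ij

# phi_edge: a tiny MLP stand-in producing the raw attention score w_ij
W1 = rng.normal(size=(2 * D + 1, D))
W2 = rng.normal(size=(D, 1))
def phi_edge(xi, xj, eij):
    h = np.tanh(np.concatenate([xi, xj, eij]) @ W1)
    return (h @ W2).item()

xi = x[0]
w = np.array([phi_edge(xi, x[j], e[j]) for j in range(1, 5)])

# softmax normalization over the neighborhood N_i
alpha = np.exp(w - w.max()) / np.exp(w - w.max()).sum()

# message aggregation with placeholder edge messages e_hat_ij
e_hat = rng.normal(size=(4, D))
m_i = (alpha[:, None] * e_hat).sum(axis=0)

# phi_node: another MLP stand-in for the node update
Wn = rng.normal(size=(2 * D, D))
x_i_new = np.tanh(np.concatenate([xi, m_i]) @ Wn)
```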

Inter-Level Information Flow

  • Downsampling (Fine to Coarse): For each coarse node $v_i^{(l+1)}$ pooling its children $\mathcal{C}(v_i^{(l+1)})$:

$$\beta_k^{(l)} = \phi_{\mathrm{down}}^{(l)}\left(\{x_k^{(l)} : k \in \mathcal{C}(v_i^{(l+1)})\}\right),$$

(normalized over children), and then

$$x_i^{(l+1)} = \sum_{k\in\mathcal{C}(v_i^{(l+1)})} \beta_k^{(l)} x_k^{(l)}.$$

  • Upsampling (Coarse to Fine): Each child $k$'s features are updated as

$$\widetilde{x}_k^{(l)} = \mathrm{LayerNorm}\left((1-\beta_k^{(l)})\, x_k^{(l)} + \beta_k^{(l)}\, x_{p(k)}^{(l+1)}\right).$$

All attention coefficients $\alpha_{ij}^{(l)}$ and $\beta_k^{(l)}$ act as dynamic filters, adaptively gating information flow between nodes or across scales.
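The down/up-sampling pair can be illustrated for a single $2 \times 2$ block of children. This is a hedged sketch: `phi_down` is a stand-in linear scorer (not the paper's network), and only the weighted pooling and the LayerNorm-blended upsampling follow the formulas:

```python
import numpy as np

rng = np.random.default_rng(1)
D = 4
children = rng.normal(size=(4, D))     # features of one 2x2 block of children

# phi_down stand-in: score each child, then softmax-normalize over the block
Wd = rng.normal(size=(D,))
scores = children @ Wd
beta = np.exp(scores - scores.max()) / np.exp(scores - scores.max()).sum()

# downsampling: attention-weighted pooling into the coarse parent node
x_coarse = (beta[:, None] * children).sum(axis=0)

# upsampling: each child blends its own state with the parent's, then LayerNorm
def layer_norm(v, eps=1e-5):
    return (v - v.mean()) / np.sqrt(v.var() + eps)

x_children_new = np.stack([
    layer_norm((1 - beta[k]) * children[k] + beta[k] * x_coarse)
    for k in range(4)
])
```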

3. Neural ODE Parameterization and Continuous-Time Modeling

HiGO models continuous-time dynamics using neural ODEs parameterized by GNNs:

  • Per-Level ODE: For each level $l$,

$$\frac{dX^{(l)}(t)}{dt} = f_{\theta}^{(l)}\left(X^{(l)}(t), t;\, G^{(l)}\right),$$

where $X^{(l)}(t)$ is the set of all node features at level $l$ and $f_\theta^{(l)}$ is a GNN-based message-passing function.

  • Joint Multi-Level Integration: The complete hierarchical state $\mathcal{X}(t) = (X^{(1)}(t), \dots, X^{(L)}(t))$ evolves as

$$\frac{d\mathcal{X}(t)}{dt} = \left[f_{\theta}^{(1)}, \dots, f_{\theta}^{(L)}\right](\mathcal{X}(t), t).$$

Level coupling occurs exclusively through inter-level edges in the graph, not via time in the ODE vector field.

  • Initialization and Solvers: Initial conditions fuse driver variables, climate indices, and the current burned-area map. The ODE integration utilizes the adaptive Dormand–Prince (RK45, "dopri5") solver, trading off computational cost and accuracy by dynamically adjusting step size.

This continuous-time formulation enables precise forecasting and interpolation of wildfire activity, adapting to the inherently non-uniform temporal evolution of such processes.
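As a minimal illustration of adaptive-step integration and continuous-time interpolation, SciPy's `RK45` method implements the same Dormand–Prince family as "dopri5". The linear vector field below is a toy stand-in for the GNN $f_\theta$, and the graph, sizes, and time horizon are illustrative:

```python
import numpy as np
from scipy.integrate import solve_ivp

rng = np.random.default_rng(2)
N, D = 6, 3                                 # toy graph: 6 nodes, 3 features each
A = np.roll(np.eye(N), 1, axis=1)           # placeholder ring "message passing"

def f(t, x_flat):
    """Toy linear stand-in for the GNN vector field f_theta(X(t), t; G)."""
    X = x_flat.reshape(N, D)
    return (-0.5 * X + 0.1 * A @ X).ravel() # decay plus neighbor coupling

x0 = rng.normal(size=N * D)                 # fused initial condition (toy)

# Adaptive-step Dormand-Prince solver; dense_output=True gives a continuous
# solution object, i.e. the state can be queried at arbitrary times.
sol = solve_ivp(f, (0.0, 48.0), x0, method="RK45", dense_output=True)
x_at_8d = sol.sol(8.0).reshape(N, D)        # interpolated state at t = 8 days
```

The `dense_output` interpolant is what makes off-grid queries (e.g. an 8-day forecast from a model integrated over 48 days) cheap, mirroring HiGO's continuous-time interpolation setting.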

4. Training Procedure and Loss Formalism

The HiGO framework is optimized for point-wise multi-class (ordinal) classification over $K$ burned-area intervals at each grid cell:

  • Logits and Probabilities: For cell $(i,j)$ and class $k \in \{0, \dots, K-1\}$,

$$\hat p_{i,j,k} = \mathrm{softmax}(\ell_{i,j,\cdot})_k .$$

  • Weighted Cross-Entropy Loss:

$$\mathcal{L} = - \frac{1}{HW} \sum_{i=1}^H \sum_{j=1}^W \sum_{k=0}^{K-1} w_k\, \mathbf{1}[y_{i,j}=k]\, \log \hat p_{i,j,k},$$

with weights $w_k$ inversely proportional to class frequency.

  • Regularization: Standard weight decay is applied to $\theta$.
  • Forecast Scheduling: Forecast horizons are scheduled at 8, 16, ..., 48 days.

No auxiliary losses are required beyond this formulation.
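The loss can be reproduced directly from the formulas above. The NumPy sketch below uses random logits and labels; the shapes and the inverse-frequency weighting follow the equations, while everything else (grid size, $K$, the clamp on empty classes) is illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)
H, W, K = 4, 4, 5                      # grid size and number of burned-area classes

logits = rng.normal(size=(H, W, K))
y = rng.integers(0, K, size=(H, W))    # ordinal class label per cell

# softmax over the class dimension (numerically stabilized)
z = logits - logits.max(axis=-1, keepdims=True)
p = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)

# class weights inversely proportional to class frequency
counts = np.bincount(y.ravel(), minlength=K).astype(float)
w = counts.sum() / np.maximum(counts, 1.0)   # clamp avoids division by zero

# weighted cross-entropy averaged over all H*W cells
loss = -np.mean(w[y] * np.log(p[np.arange(H)[:, None], np.arange(W), y]))
```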

5. Empirical Performance on Global Wildfire Forecasting

The SeasFire Cube dataset serves as the primary benchmark for HiGO, against baselines spanning three model families:

  • Point-wise: MLP, XGBoost
  • Vision-based: U-Net, ViT, Swin-Transformer
  • Graph-based: GCN, NDCN (Neural Dynamics on Complex Networks), GraphCast

Key evaluation metrics are Macro-F1 score (on the "fire" class) and area under precision-recall curve (AUPRC), computed using binary fire/no-fire labels.
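Both metrics are computable from binary fire/no-fire labels. The following is a self-contained sketch of binary Macro-F1 and average precision (a step-wise AUPRC estimate), not the paper's evaluation pipeline:

```python
import numpy as np

def f1(y_true, y_pred):
    """Binary F1 for the positive class."""
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the fire / no-fire classes."""
    return 0.5 * (f1(y_true, y_pred) + f1(1 - y_true, 1 - y_pred))

def auprc(y_true, scores):
    """Average precision: step-wise area under the precision-recall curve."""
    order = np.argsort(-scores)            # rank cells by predicted fire score
    y = y_true[order]
    tp = np.cumsum(y)
    prec = tp / np.arange(1, len(y) + 1)
    rec = tp / y.sum()
    return np.sum(np.diff(np.concatenate([[0.0], rec])) * prec)
```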

  • Short-term (8d): HiGO attains Macro-F1 = 0.581 versus GraphCast's 0.575, and AUPRC = 0.653 versus 0.631.
  • Long-range (48d): HiGO achieves Macro-F1 = 0.423 versus GraphCast's 0.384 (+3.9 points), and AUPRC = 0.522 versus 0.474 (+4.8 points), with a consistently increasing margin at greater forecast horizons.
  • Continuous-Time Interpolation: When trained on 16-day data, HiGO evaluated at 8d yields Macro-F1 = 0.551 (NDCN: 0.493, GraphCast: 0.484), and at 24d yields Macro-F1 = 0.507 (NDCN: 0.486, GraphCast: 0.459).
  • Observational Consistency: The method produces robust, physically consistent, continuous-time predictions.

6. Discussion, Significance, and Generalization

HiGO's stable continuous-time predictions arise from its single GNN-ODE formulation of the system's vector field $\dot{\mathcal{X}} = f_\theta(\mathcal{X}, t)$. This avoids the error-amplifying behavior of discrete recurrent schemes, and the adaptive-step ODE solvers automatically refine integration where system dynamics become stiff (e.g., rapid wildfire spread) while economizing computation in slowly evolving regimes.

Potential generalizations include application to any spatiotemporal phenomenon characterized by multi-scale coupling and continuous dynamics: examples include precipitation nowcasting, oceanic/atmospheric modeling, pollutant dispersion, epidemic modeling, and ecological invasion dynamics. The hierarchical pooling scheme could be replaced with alternative graph coarsening strategies for irregular domains, such as adaptive quadtrees or hemispherical meshes. Incorporation of physics-informed regularizers (e.g., enforcing conservation laws) into the ODE loss is another avenue for future research.

HiGO combines multi-level graph modeling, adaptive filtering message passing, and continuous-time neural ODE integration, achieving state-of-the-art performance on challenging continuous-time forecasting benchmarks in Earth science domains (Xu et al., 4 Jan 2026).
