IGraSS: Learning to Identify Infrastructure Networks from Satellite Imagery by Iterative Graph-constrained Semantic Segmentation

Published 9 Jun 2025 in cs.CV and cs.AI | (2506.08137v2)

Abstract: Accurate canal network mapping is essential for water management, including irrigation planning and infrastructure maintenance. State-of-the-art semantic segmentation models for infrastructure mapping, such as roads, rely on large, well-annotated remote sensing datasets. However, incomplete or inadequate ground truth can hinder these learning approaches. Many infrastructure networks have graph-level properties, such as reachability to a source (canals) or connectivity (roads), that can be leveraged to improve existing ground truth. This paper develops a novel iterative framework, IGraSS, combining a semantic segmentation module that incorporates RGB and additional modalities (NDWI, DEM) with a graph-based ground-truth refinement module. The segmentation module processes satellite imagery patches, while the refinement module operates on the entire data, viewing the infrastructure network as a graph. Experiments show that IGraSS reduces unreachable canal segments from around 18% to 3%, and training with refined ground truth significantly improves canal identification. IGraSS serves as a robust framework for both refining noisy ground truth and mapping canal networks from remote sensing imagery. We also demonstrate the effectiveness and generalizability of IGraSS using road networks as an example, applying a different graph-theoretic constraint to complete road networks.


Summary

The paper introduces an iterative framework that couples deep segmentation with graph-based label refinement to enforce global connectivity.
It leverages multimodal inputs (RGB, NDWI, DEM) to boost performance, achieving up to a 10% improvement in key metrics.
Experiments show that the fraction of unreachable canal segments falls from about 18% to about 3%, and that the same approach completes road networks under a different graph constraint.

IGraSS: Iterative Graph-constrained Semantic Segmentation for Infrastructure Network Mapping from Remote Sensing

Problem Domain and Motivation

Accurate mapping of infrastructure networks such as irrigation canals and roads is a foundational requirement for water management, agricultural modernization, urban planning, and disaster mitigation. State-of-the-art semantic segmentation models for infrastructure mapping via high-resolution satellite imagery are fundamentally limited by the availability of complete, high-quality ground truth. In domains such as canal mapping, manual annotation is labor-intensive, suffers from incomplete connectivity, and often exhibits noisy or fragmented labels that impede deep learning model performance. IGraSS (Iterative Graph-constrained Semantic Segmentation) addresses this critical bottleneck by introducing a hybrid, iterative framework that couples multimodal semantic segmentation with a graph-theoretic optimization module to yield self-improving annotations and robust extraction of network topology, even under sparse or broken supervision.

IGraSS is an iterative two-module framework that alternates between a deep learning-based semantic segmentation module (the Learner) and a structure-aware, graph-based ground truth refinement module (the Constraint Solver). At its core, the approach recognizes that network-like infrastructure (canals, roads) must obey explicit graph constraints, such as reachability from sources or overall connectivity. These are global properties that conventional pixelwise losses typically ignore.

The processing loop of IGraSS proceeds as follows:

Segmentation Module: Utilizes contemporary architectures (DeepLabV3+, ResUNet, Swin Transformer) with multimodal remote sensing inputs—RGB, NDWI (Normalized Difference Water Index), and DEM (Digital Elevation Model)—for per-pixel prediction over image patches.
Graph Aggregation and Completion: The outputs are stitched and thresholded to represent a network mask. The region is overlaid with a grid (Moore neighborhood adjacency), forming a graph $G$ with predicted canal/road pixels as nodes. Unreachable subcomponents are identified, and candidates for completion (broken “terminals”) are computed. For each unreachable terminal, a local subgraph is constructed, and shortest path algorithms, with edge/node weights derived from the segmentation confidence, are used to join unreachable segments to reachable components.
Label Augmentation and Iteration: Paths found by the completion algorithm are used as pseudo-labels, augmenting the training mask. After each iteration, the Learner is retrained using the refined ground truth, and the process repeats. Adaptive confidence and spatial radius thresholds are employed for robust label refinement.
Evaluation and Stopping: The framework iterates until convergence (e.g., minimal unreachable fraction), with metrics monitored across iterations.
Figure 2: An example visualization of IGraSS network completion: blue—reachable, red—unreachable, green/yellow/pink indicate iterative completion steps.
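
The reachability check in the graph step can be illustrated with a minimal sketch. It assumes the stitched, thresholded mask is a 2D grid of 0/1 values and that source pixels are given as (row, column) pairs; the names `reachable_pixels` and `unreachable_fraction` are hypothetical and not taken from the paper. Pixels are treated as nodes, and two canal pixels are adjacent when they are Moore (8-connected) neighbors.

```python
from collections import deque

# Offsets of the 8 Moore neighbours of a grid cell.
MOORE = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def reachable_pixels(mask, sources):
    """Breadth-first search over canal pixels (mask value 1) from the sources."""
    rows, cols = len(mask), len(mask[0])
    seen = {s for s in sources if mask[s[0]][s[1]]}
    queue = deque(seen)
    while queue:
        r, c = queue.popleft()
        for dr, dc in MOORE:
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and mask[nr][nc] and (nr, nc) not in seen:
                seen.add((nr, nc))
                queue.append((nr, nc))
    return seen


def unreachable_fraction(mask, sources):
    """Share of canal pixels that cannot be reached from any source."""
    total = sum(1 for row in mask for v in row if v)
    if total == 0:
        return 0.0
    return (total - len(reachable_pixels(mask, sources))) / total


# Toy mask: a three-pixel branch attached to the source and a detached
# two-pixel fragment on the right edge.
mask = [
    [1, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 1],
]
sources = [(0, 0)]
print(unreachable_fraction(mask, sources))  # 0.4
```

In this toy case the diagonal step from (0, 1) to (1, 2) counts as connected because of the Moore adjacency, while the fragment at (2, 4) and (3, 4) is unreachable and would become a candidate for completion.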

The framework is designed so that segmentation models progressively receive more topologically consistent and connected supervision, which helps overcome the limitations of fragmented or missing labels.
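
The completion step can be sketched in the same spirit. The following is an illustration under stated assumptions, not the paper's implementation: the cost of stepping onto a pixel is taken to be one minus its segmentation confidence, the local subgraph is a square window of half-width `radius` around the terminal, and pixels below `min_conf` are excluded. The function name `connect_terminal` and all parameter names are hypothetical.

```python
import heapq

# Offsets of the 8 Moore neighbours of a grid cell.
MOORE = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def connect_terminal(conf, reachable, terminal, radius, min_conf):
    """Dijkstra from an unreachable terminal to the nearest reachable pixel.

    Assumed cost model: stepping onto a pixel costs (1 - confidence), so
    high-confidence pixels are cheap. The search is limited to a square
    window of half-width `radius` around the terminal. Returns the path as
    a list of (row, col) pairs, or None if no connection is found.
    """
    rows, cols = len(conf), len(conf[0])
    r0, c0 = terminal
    dist = {terminal: 0.0}
    prev = {}
    heap = [(0.0, terminal)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry
        if node in reachable:
            path = [node]
            while path[-1] != terminal:
                path.append(prev[path[-1]])
            return path[::-1]
        r, c = node
        for dr, dc in MOORE:
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            if max(abs(nr - r0), abs(nc - c0)) > radius:
                continue  # outside the local subgraph
            if (nr, nc) not in reachable and conf[nr][nc] < min_conf:
                continue  # too unlikely to be part of the network
            nd = d + (1.0 - conf[nr][nc]) + 1e-6  # small constant breaks ties
            if nd < dist.get((nr, nc), float("inf")):
                dist[(nr, nc)] = nd
                prev[(nr, nc)] = node
                heapq.heappush(heap, (nd, (nr, nc)))
    return None


# Toy confidence map: the top row is a canal with a two-pixel gap.
conf = [
    [0.9, 0.9, 0.7, 0.6, 0.9],
    [0.1, 0.1, 0.1, 0.1, 0.1],
]
reachable = {(0, 0), (0, 1)}
path = connect_terminal(conf, reachable, (0, 4), radius=4, min_conf=0.5)
print(path)  # [(0, 4), (0, 3), (0, 2), (0, 1)]
```

The interior pixels of a returned path would be added to the training mask as pseudo-labels. Shrinking `radius` to 2 or raising `min_conf` to 0.65 makes the same call return `None`, which mirrors the trade-off between coverage and false joins that the thresholds control.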

Novelty in Metrication: r-Neighborhood Evaluation

Because infrastructure lines are annotated as thin, single-pixel structures, the paper introduces parameterized $r$-neighborhood metrics. These relax strict pixelwise coincidence: a prediction within radius $r$ of a ground-truth pixel counts as a true positive, so the score reflects structural alignment rather than exact pixel overlap. This is critical in geospatial domains, where small spatial offsets in ground truth or predictions can otherwise dramatically degrade reported scores.

Figure 1: r-neighborhood metrics more accurately reflect the perceptual and topological quality of thin-structure segmentation, alleviating harsh penalization from minor misalignments.
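
A minimal sketch of such a relaxed metric follows. The paper's exact definition is not reproduced here; this version assumes a square window of half-width `radius` and counts a predicted pixel as correct if any ground-truth pixel falls inside its window, and symmetrically for recall. The names `within_radius` and `r_neighborhood_scores` are hypothetical.

```python
def within_radius(mask, r, c, radius):
    """True if any set pixel of `mask` lies in the square window around (r, c)."""
    rows, cols = len(mask), len(mask[0])
    for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
        for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
            if mask[rr][cc]:
                return True
    return False


def r_neighborhood_scores(pred, truth, radius):
    """Relaxed precision, recall and F1 for thin-structure masks."""
    pred_px = [(r, c) for r, row in enumerate(pred) for c, v in enumerate(row) if v]
    true_px = [(r, c) for r, row in enumerate(truth) for c, v in enumerate(row) if v]
    hit_pred = sum(within_radius(truth, r, c, radius) for r, c in pred_px)
    hit_true = sum(within_radius(pred, r, c, radius) for r, c in true_px)
    precision = hit_pred / len(pred_px) if pred_px else 0.0
    recall = hit_true / len(true_px) if true_px else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# The prediction is the ground-truth line shifted down by one pixel.
truth = [[0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0]]
pred = [[0, 0, 0, 0], [0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0]]
print(r_neighborhood_scores(pred, truth, 0))  # (0.0, 0.0, 0.0)
print(r_neighborhood_scores(pred, truth, 1))  # (1.0, 1.0, 1.0)
```

With `radius=0` the metric reduces to strict pixelwise scoring and the one-pixel offset yields zero on every measure; with `radius=1` the same prediction is scored as a full match.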

Quantitative Results and Empirical Analysis

Baseline Performance and Ground Truth Improvement

Experiments were conducted on a large dataset ($\sim$30,000 patches) of 3m-resolution PlanetScope imagery for central Washington's canal network, with additional NDWI and DEM channels. To quantify IGraSS’s capacity for improvement, each Learner model was trained with both the original and the iteratively-refined ground truth.

With IGraSS-refined ground truth, Swin Transformer improved on Test Set 1: precision from 0.775 to 0.820, recall from 0.765 to 0.810, F1 from 0.770 to 0.815, and IoU from 0.720 to 0.760.
Across all models, 5–10% absolute improvement was observed in all standard and $r$-neighborhood metrics upon adopting refined labels.
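
As a consistency check on the Swin Transformer figures, the reported F1 values follow from the reported precision $P$ and recall $R$ through the standard harmonic-mean definition:

$$F_1 = \frac{2PR}{P + R}, \qquad \frac{2(0.775)(0.765)}{0.775 + 0.765} = \frac{1.18575}{1.540} \approx 0.770, \qquad \frac{2(0.820)(0.810)}{0.820 + 0.810} = \frac{1.3284}{1.630} \approx 0.815.$$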

Network completion analysis reveals that after 5 refinement rounds, the proportion of unreachable canal pixels was reduced from $\sim$18% to below 3%. The iterative process exhibited monotonic reduction of disconnected network segments (see Figure 3).

Figure 3: Quantitative evidence of improved network reachability after successive IGraSS iterations for canal segmentation.

Ablation: Effect of Multimodal Inputs

Addition of NDWI, sensitive to water, boosted IoU by 35–40%; DEM contributed an additional 10–20%. Using both NDWI and DEM yielded up to 50% cumulative improvement compared to RGB-only baselines, highlighting the critical role of exogenous spectral and physical context in hydro-infrastructure extraction.
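
How the extra channels might be assembled can be sketched as follows. NDWI is computed with the standard green/near-infrared formula, $(G - NIR) / (G + NIR)$; the min-max scaling of the DEM and the five-channel ordering are assumptions made for this illustration, as are the function names `ndwi`, `min_max_scale`, and `stack_channels`. Bands are plain nested lists of reflectance or elevation values.

```python
def ndwi(green, nir):
    """Per-pixel NDWI = (green - nir) / (green + nir); 0.0 where both are 0."""
    return [
        [(g - n) / (g + n) if (g + n) != 0 else 0.0 for g, n in zip(g_row, n_row)]
        for g_row, n_row in zip(green, nir)
    ]


def min_max_scale(band):
    """Rescale a band to [0, 1]; a constant band maps to all zeros."""
    flat = [v for row in band for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    return [[(v - lo) / span if span else 0.0 for v in row] for row in band]


def stack_channels(red, green, blue, nir, dem):
    """Build a (rows x cols) grid of (R, G, B, NDWI, DEM) tuples."""
    water = ndwi(green, nir)
    dem01 = min_max_scale(dem)
    rows, cols = len(red), len(red[0])
    return [
        [(red[r][c], green[r][c], blue[r][c], water[r][c], dem01[r][c])
         for c in range(cols)]
        for r in range(rows)
    ]


# Toy 1x2 patch: the left pixel is water-like (low NIR), the right pixel is
# vegetation-like (high NIR).
red = [[0.10, 0.20]]
green = [[0.30, 0.20]]
blue = [[0.20, 0.10]]
nir = [[0.10, 0.60]]
dem = [[350.0, 360.0]]
patch = stack_channels(red, green, blue, nir, dem)
```

In the toy patch the water-like pixel receives an NDWI of about 0.5 and the vegetation-like pixel about -0.5, which is the kind of contrast that makes the index informative for canals.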

Parameter Studies and Error Analysis

Parameter sweeps on spatial radius, confidence threshold, and number of inner epochs demonstrated trade-offs between noise (false joins) and coverage (connectivity restoration). Lower confidence thresholds accelerated connection of terminals but, if not properly annealed, occasionally introduced spurious connections (see Figure 4).

Figure 4: Error analysis highlights failure cases under suboptimal thresholding: spurious joins can occur when parameters are poorly tuned.

Generalizability: Road Network Completion

Road network datasets (NYC, OpenStreetMap, random edge removal) were used to demonstrate that IGraSS is agnostic to the nature of the network, provided a graph-theoretic constraint is available (e.g., minimizing all-pairs shortest-path lengths). Restoration of urban road connectivity and of the shortest path distribution (Figure 5) validated IGraSS’s capacity to optimize for various topological criteria beyond reachability.

Figure 5: IGraSS restores disrupted road networks, driving shortest path statistics toward the original network's values via iterative completion.
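
The shortest-path statistic used in the road experiments can be illustrated on a toy graph. The sketch below assumes an unweighted, undirected road graph and averages hop counts over all connected ordered pairs; the paper's exact statistic and data format may differ, and `avg_shortest_path` is a hypothetical name.

```python
from collections import deque


def avg_shortest_path(nodes, edges):
    """Mean BFS hop count over all ordered pairs of connected nodes."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    total, pairs = 0, 0
    for src in nodes:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for dst, d in dist.items():
            if dst != src:
                total += d
                pairs += 1
    return total / pairs if pairs else float("inf")


# A four-intersection city block, then the same block with one street removed.
nodes = [0, 1, 2, 3]
original = [(0, 1), (1, 2), (2, 3), (3, 0)]
damaged = [(0, 1), (1, 2), (2, 3)]
restored = damaged + [(3, 0)]

print(avg_shortest_path(nodes, original) < avg_shortest_path(nodes, damaged))  # True
```

Removing the street lengthens the average trip from 4/3 hops to 5/3 hops, and adding it back returns the statistic to its original value, which is the behavior the completion procedure aims to recover at scale.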

Theoretical and Practical Implications

IGraSS fuses local per-pixel learning and global structure satisfaction, enabling semantic segmentation in situations where full, correct supervision is unavailable and classic CRF/graph regularization on output logits is insufficient. Theoretical implications include:

Decoupled, post-hoc constraint satisfaction: Rather than augmenting network loss with penalties, IGraSS operates outside the loop of SGD, iteratively correcting training data itself.
Data-centric AI paradigm: Performance is most directly improved not by changing model architecture, but by self-bootstrapping more consistent, constraint-compliant ground truth.
General applicability: The iterative pseudo-labeling/optimization loop is applicable across domains with explicit graph/structural priors, including utility mapping, ecological monitoring, and disaster response.

Practically, this enables robust infrastructure mapping from noisy, incomplete, or out-of-date vector data—critical for regions where ground truth is sparse or human annotation is prohibitively expensive. The framework is highly scalable, leveraging batch inference and modular plug-in graph modules compatible with existing geospatial data pipelines.

Limitations and Future Directions

While IGraSS corrects weak supervision and recovers global topology, several limitations exist:

Dependence on explicit constraint specification: IGraSS requires well-defined global properties (e.g., all canals must be connected to a water source) and labeled source points. In the absence of clear constraints, its benefit diminishes.
Potential error propagation: Aggressive pseudo-labeling or incorrect initial segmentation can propagate noise if parameters (confidence/radius/iteration count) are not well-calibrated.
Invisibility of certain infrastructure: Sections of networks obscured by vegetation, structures, or image artifacts are irrecoverable by any image-driven method and will remain disconnected in final outputs.

Future research directions include dynamic/learned constraint discovery, integration with multimodal/multitemporal satellite products, modeling of temporal changes in infrastructure, and automated parameter tuning (e.g., via meta-learning or Bayesian optimization).

Conclusion

IGraSS demonstrates that iterative, graph-aware pseudo-labeling is an effective paradigm for learning and correcting sparse or broken infrastructure supervision in remote sensing contexts. By enforcing global connectivity or reachability during iterative annotation refinement, the framework ensures that deep models both respect domain-specific structure and extract more complete, operationally useful infrastructure maps. Its broader significance extends to any setting where network structure is fundamental but high-quality annotation is lacking, making it a key advance in data-centric geospatial AI.

Figure 6: Visualization of IGraSS’s completion of disconnected canals via iterative shortest path connection of unreachable segments to the main network.