Joint Learning of Blind Super-Resolution and Crack Segmentation for Realistic Degraded Images

Published 24 Feb 2023 in cs.CV, cs.AI, and eess.IV | arXiv:2302.12491v3

Abstract: This paper proposes crack segmentation augmented by super resolution (SR) with deep neural networks. In the proposed method, a SR network is jointly trained with a binary segmentation network in an end-to-end manner. This joint learning allows the SR network to be optimized for improving segmentation results. For realistic scenarios, the SR network is extended from non-blind to blind for processing a low-resolution image degraded by unknown blurs. The joint network is improved by our proposed two extra paths that further encourage the mutual optimization between SR and segmentation. Comparative experiments with State of The Art (SoTA) segmentation methods demonstrate the superiority of our joint learning, and various ablation studies prove the effects of our contributions.


Summary

  • The paper proposes a deep neural network framework that jointly learns blind super-resolution and crack segmentation end-to-end for images with unknown degradations.
  • Experimental evaluation shows the joint learning method significantly outperforms independent SR and segmentation approaches, improving metrics like IoU by approximately 10%.
  • This joint learning framework has significant implications for automated visual inspection systems, enabling improved crack detection in real-world degraded images for infrastructure monitoring.

Joint Learning of Blind Super-Resolution and Crack Segmentation for Realistic Degraded Images

This paper addresses the challenge of segmenting cracks in realistically degraded images by proposing a method that jointly learns blind super-resolution (SR) and crack segmentation. A key contribution is its focus on practical scenarios in which images undergo unknown degradations, such as motion or out-of-focus blur, that are commonplace in automated inspection from vehicles such as drones and cars.

Key Contributions and Methodology

  1. Joint SR and Segmentation Framework: The study advances current methodologies by integrating a deep neural network-based SR network with a binary segmentation network, trained end-to-end. The SR component is extended from non-blind to blind, so that it can be optimized for segmentation performance even on images degraded by unknown blurs.
  2. Mutual Optimization Paths: To further enhance the network's performance, additional paths are introduced that optimize SR specifically for segmentation tasks. These paths ensure a two-way beneficial relationship between SR and segmentation processes.
  3. Boundary Combo Loss: A novel Boundary Combo (BC) loss function is introduced. It is designed to robustly handle the severe class imbalance between thin crack pixels and the dominant non-crack background by combining a global region-based constraint with a boundary-sensitive term for detecting fine crack boundaries.
  4. Segmentation-aware SR-loss Weights: By weighting the SR loss according to segmentation difficulty, challenges such as vanishing gradients during training are mitigated. These weights direct the optimization process to better support both SR and segmentation simultaneously.
  5. Blur-reflected Task Learning: The method leverages the blur characteristics estimated during SR to improve segmentation robustness. By passing this estimated blur information to the segmentation network through a skip path, finer segmentation outputs are achieved.
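To make the class-imbalance handling in item 3 concrete, here is a minimal NumPy sketch of a combo-style loss with boundary emphasis: a Dice term (region-level, global) plus a boundary-weighted cross-entropy term (local). The `boundary_weights` helper and the `alpha` mixing parameter are illustrative assumptions; the exact BC loss formulation in the paper differs in detail.

```python
import numpy as np

def dice_loss(pred, target, eps=1e-6):
    # Soft Dice loss: a global region-based term that is robust to the
    # heavy class imbalance of thin cracks vs. background.
    inter = (pred * target).sum()
    return 1.0 - (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

def boundary_weights(target):
    # Up-weight pixels near crack boundaries (a simple 4-neighbour
    # gradient stands in for a distance-transform-based boundary map).
    gy = np.abs(np.diff(target, axis=0, prepend=target[:1]))
    gx = np.abs(np.diff(target, axis=1, prepend=target[:, :1]))
    return 1.0 + 4.0 * np.clip(gx + gy, 0.0, 1.0)

def boundary_combo_loss(pred, target, alpha=0.5, eps=1e-6):
    # Combo-style mix: boundary-weighted binary cross-entropy (local
    # boundary constraint) + Dice (global region constraint).
    w = boundary_weights(target)
    bce = -(target * np.log(pred + eps) + (1 - target) * np.log(1 - pred + eps))
    return alpha * (w * bce).mean() + (1 - alpha) * dice_loss(pred, target)
```

A prediction that closely matches a thin-crack mask yields a lower loss than an uncertain one, with the boundary weighting penalizing errors along the crack edges most strongly.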

Experimental Evaluation

Experiments demonstrate the method's superiority over state-of-the-art segmentation methods. The joint learning framework consistently outperforms pipelines that train SR and segmentation independently. Metrics such as IoU, PSNR, and Structural Similarity (SSIM) confirm its efficacy, including an IoU improvement of approximately 10% over non-blind SR-based methods.
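The two core evaluation metrics named above are standard and easy to sketch: IoU scores the binary segmentation masks, while PSNR scores the super-resolved images against their high-resolution references (SSIM is omitted here for brevity). This is a generic sketch of the metric definitions, not the paper's evaluation code.

```python
import numpy as np

def iou(pred_mask, gt_mask):
    # Intersection over Union between two binary crack masks.
    pred_mask = pred_mask.astype(bool)
    gt_mask = gt_mask.astype(bool)
    union = np.logical_or(pred_mask, gt_mask).sum()
    if union == 0:
        return 1.0  # both masks empty: perfect agreement by convention
    return np.logical_and(pred_mask, gt_mask).sum() / union

def psnr(sr_img, hr_img, max_val=1.0):
    # Peak signal-to-noise ratio of a super-resolved image against its
    # high-resolution reference, for images scaled to [0, max_val].
    mse = np.mean((sr_img - hr_img) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)
```

For example, a predicted mask that overlaps the ground truth in one of seven occupied cells scores IoU = 1/7, and an image with mean squared error 0.01 against its reference scores PSNR = 20 dB.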

Implications and Future Work

The implications of this study are substantial for automated visual inspection systems, especially in infrastructure monitoring. Real-world applications include detecting deteriorations in civil infrastructure like bridges and roads, where frequent manual inspection is impractical.

Theoretically, this paper adds to the discourse on multi-task learning, demonstrating how intricately connected visual tasks can mutually benefit from a unified learning process, especially under real-world degradation effects.

Future considerations might explore extending this framework to include video sequences, thus leveraging temporal data for improved SR and segmentation results. Moreover, addressing even more severe degradations and testing on a broader array of real-world conditions can further elevate the applicability of this research.

In summary, this paper presents a robust and practical approach to handling an intricate problem that combines advanced techniques in super-resolution and segmentation to tackle real-world challenges in automated inspection and monitoring.
