
IRSRMamba: Infrared Image Super-Resolution via Mamba-based Wavelet Transform Feature Modulation Model

Published 16 May 2024 in cs.CV and eess.IV | (2405.09873v2)

Abstract: Infrared image super-resolution demands long-range dependency modeling and multi-scale feature extraction to address challenges such as homogeneous backgrounds, weak edges, and sparse textures. While Mamba-based state-space models (SSMs) excel in global dependency modeling with linear complexity, their block-wise processing disrupts spatial consistency, limiting their effectiveness for IR image reconstruction. We propose IRSRMamba, a novel framework integrating wavelet transform feature modulation for multi-scale adaptation and an SSM-based semantic consistency loss to restore fragmented contextual information. This design enhances global-local feature fusion, structural coherence, and fine-detail preservation while mitigating block-induced artifacts. Experiments on benchmark datasets demonstrate that IRSRMamba outperforms state-of-the-art methods in PSNR, SSIM, and perceptual quality. This work establishes Mamba-based architectures as a promising direction for high-fidelity IR image enhancement. Code is available at https://github.com/yongsongH/IRSRMamba.


Summary

  • The paper introduces IRSRMamba, merging Mamba-based state-space modeling with wavelet feature modulation to address long-range dependencies and enhance sparse detail restoration in infrared images.
  • The reported PSNR reaches 39.33 dB at ×2 scale on the result-A dataset, outperforming both traditional and recent state-of-the-art methods such as EDSR, ESRGAN, and SwinIR.
  • The model’s success paves the way for advanced state-space techniques in infrared imaging, with promising applications in security and planetary exploration.


The paper "IRSRMamba: Infrared Image Super-Resolution via Mamba-based Wavelet Transform Feature Modulation Model" by Yongsong Huang et al. examines the challenges of infrared (IR) image super-resolution and proposes an approach that combines a Mamba-based state-space model (SSM) with wavelet transform techniques. The authors report that IRSRMamba outperforms existing methods at modeling long-range dependencies and restoring the sparse details typical of IR imagery.

Methodological Innovations

The proposed IRSRMamba model tackles the difficulties of IR image super-resolution with a Mamba-based backbone network. Mamba derives from structured state-space models, which originate in the description of continuous linear time-invariant systems. The backbone is intended to capture long-range spatial dependencies with linear complexity, addressing the homogeneous backgrounds and sparse details found in IR images. The authors present the application of a Mamba-based approach to IR image processing as a key contribution.
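
To make the state-space idea concrete, the following is a minimal sketch of a fixed-parameter, diagonal linear state-space recurrence in NumPy. It is not the paper's backbone and not Mamba's selective scan, whose parameters depend on the input. The function name `ssm_scan`, the zero-order-hold discretization, and the parameter names `a`, `b`, `c`, and `dt` are assumptions chosen for illustration. In a vision model, a 2-D feature map would first be flattened into one or more 1-D scan orders before such a recurrence is applied.

```python
import numpy as np

def ssm_scan(x, a, b, c, dt=1.0):
    """Run a diagonal linear state-space model over a 1-D sequence.

    Continuous form (illustrative): h'(t) = a * h(t) + b * x(t), y(t) = c . h(t)
    Zero-order-hold discretization:  a_bar = exp(dt * a)
                                     b_bar = (a_bar - 1) / a * b
    All entries of `a` must be nonzero.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    c = np.asarray(c, dtype=float)
    a_bar = np.exp(dt * a)
    b_bar = (a_bar - 1.0) / a * b

    h = np.zeros_like(a)      # hidden state, one entry per state dimension
    y = np.empty(len(x))
    for t, x_t in enumerate(x):
        # One state update per step: the cost grows linearly with length.
        h = a_bar * h + b_bar * x_t
        y[t] = c @ h
    return y
```

Because each output depends on the running state rather than on pairwise comparisons between positions, an input at the start of the sequence can still influence outputs far later, at a cost of one state update per element.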

Additionally, the authors incorporate a wavelet transform feature modulation block to support multi-scale feature representation. Decomposing features into frequency sub-bands lets the model capture both local and global information. Modulating the feature maps through the wavelet transform, combined with different convolution operations, is intended to improve scale-specific detail capture and to refine the restoration of sparse patterns and contexts in IR images.
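
As a rough illustration of wavelet-domain modulation, the sketch below performs a one-level 2-D Haar decomposition, rescales the three detail sub-bands, and reconstructs the result. This is a simplified stand-in, not the paper's modulation block: the choice of the Haar wavelet, the function names, and the scalar `gain` are all assumptions, and the learned convolutions described above are replaced here by a single fixed multiplier.

```python
import numpy as np

def haar_dwt2(x):
    """One-level 2-D Haar transform; x needs even height and width."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]   # top-left, top-right of each 2x2 block
    c, d = x[1::2, 0::2], x[1::2, 1::2]   # bottom-left, bottom-right
    ll = (a + b + c + d) / 2.0            # local average (low frequency)
    lh = (a + b - c - d) / 2.0            # top rows minus bottom rows
    hl = (a - b + c - d) / 2.0            # left columns minus right columns
    hh = (a - b - c + d) / 2.0            # diagonal difference
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2."""
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w), dtype=float)
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 0::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x

def modulate_detail(x, gain=1.5):
    """Hypothetical modulation: scale detail sub-bands, keep the average."""
    ll, lh, hl, hh = haar_dwt2(np.asarray(x, dtype=float))
    return haar_idwt2(ll, gain * lh, gain * hl, gain * hh)
```

With `gain=1.0` the round trip returns the input unchanged, and a constant image is unaffected by any gain because its detail sub-bands are zero. Only edges and textures are amplified or attenuated, which is the property that makes sub-band processing attractive for images with weak edges.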

Strong Numerical Results and Evaluation

The paper's experimental evaluation covers multiple benchmarks. IRSRMamba is reported to outperform traditional and recent state-of-the-art methods, such as EDSR, ESRGAN, and SwinIR, on key metrics including Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM). The ablation studies report PSNR improvements that support IRSRMamba as a strong alternative for IR image super-resolution. In particular, IRSRMamba achieves a PSNR of 39.33 dB at a scale factor of ×2 on the result-A dataset, outperforming the compared models and improving the restoration of the fine features common in IR imagery.
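
For readers unfamiliar with the metric, PSNR is defined as 10·log10(peak² / MSE), where MSE is the mean squared error between the reference and the reconstructed image. The sketch below computes it with NumPy. The function name `psnr` and the default `peak=255.0` (8-bit images) are assumptions; the paper's exact evaluation protocol, such as any border cropping or channel conversion, is not described here.

```python
import numpy as np

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")   # identical images
    return float(10.0 * np.log10(peak ** 2 / mse))
```

Under the assumed 8-bit peak, a PSNR of 39.33 dB corresponds to a mean squared error of about 7.6, or a root-mean-square error of roughly 2.75 gray levels.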

Implications and Future Directions

The research presented in this paper offers several theoretical and practical implications. The introduction of IRSRMamba demonstrates that Mamba-based models can be successfully applied beyond their conventional domains, providing a robust framework for resolving long-range dependencies in complex image datasets like those seen in infrared imaging. The wavelet transform feature modulation block creates new avenues for feature extraction and enhancement, potentially inspiring further research in other image processing applications.

In terms of future work, further exploration of Mamba models for IR image enhancement, and of their integration with other machine learning frameworks, could extend these results. Such advances could benefit fields such as security and planetary exploration, where infrared imaging is pivotal. The model's ability to generalize across datasets also suggests that it may apply to other sensor data domains that require enhanced resolution and detail restoration.

In conclusion, this paper contributes an IR image super-resolution method that combines state-space modeling with wavelet transform feature modulation to address the domain-specific challenges of IR image enhancement. Its reported performance positions IRSRMamba as a notable reference point for future work on learning-based image processing.
