Event-guided Deblurring of Unknown Exposure Time Videos

Published 13 Dec 2021 in cs.CV | (2112.06988v3)

Abstract: Motion deblurring is a highly ill-posed problem due to the loss of motion information in the blur degradation process. Since event cameras can capture apparent motion with a high temporal resolution, several attempts have explored the potential of events for guiding deblurring. These methods generally assume that the exposure time is the same as the reciprocal of the video frame rate. However, this is not true in real situations, and the exposure time might be unknown and vary dynamically depending on the video shooting environment (e.g., illumination condition). In this paper, we address event-guided motion deblurring assuming a dynamically variable, unknown exposure time of the frame-based camera. To this end, we first derive a new formulation for event-guided motion deblurring by considering the exposure and readout time in the video frame acquisition process. We then propose a novel end-to-end learning framework for event-guided motion deblurring. In particular, we design a novel Exposure Time-based Event Selection (ETES) module to selectively use event features by estimating the cross-modal correlation between the features from blurred frames and the events. Moreover, we propose a feature fusion module to fuse the selected features from events and blur frames effectively. We conduct extensive experiments on various datasets and demonstrate that our method achieves state-of-the-art performance.

Summary

The paper introduces an end-to-end framework that integrates event and frame features to recover sharp images from videos with unknown exposure times.
The approach combines an Exposure Time-based Event Selection (ETES) module with a feature fusion module based on dynamic convolutions, improving average PSNR by 1.66 dB to 6.40 dB over other event-guided methods on synthetic data.
The paper demonstrates state-of-the-art performance on both synthetic and real-world datasets, effectively restoring intricate textures under severe motion blur.

Introduction

Motion deblurring is the problem of recovering sharp images from motion-blurred frames. The blur arises because the sensor integrates scene information over time during camera exposure. Prior methods, especially those leveraging deep learning (DL), have largely assumed a fixed exposure time; event-guided approaches in particular typically take it to equal the reciprocal of the video frame rate. This limits them in real-world scenarios, where the exposure time can vary dynamically with environmental conditions such as lighting. This paper addresses the gap by introducing a framework for event-guided motion deblurring that operates under dynamically variable and unknown exposure times.
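To make the role of exposure time concrete, the following minimal sketch models a blurred frame as the average of sharp sub-frames captured while the shutter is open. It is an illustration under stated assumptions, not the paper's formulation: the function name `synthesize_blur`, the `exposure_fraction` parameter, and the choice to place the exposure at the start of the frame period (with the remainder treated as readout) are all hypothetical.

```python
def synthesize_blur(sharp_frames, exposure_fraction):
    """Average the sharp sub-frames that fall inside the exposure window.

    sharp_frames: list of frames (each a flat list of floats) sampled
        uniformly across one frame period.
    exposure_fraction: exposure time divided by the frame period, in (0, 1].
        Assumption: exposure starts at the beginning of the period and the
        rest of the period is readout.
    """
    if not sharp_frames:
        raise ValueError("sharp_frames must not be empty")
    if not 0.0 < exposure_fraction <= 1.0:
        raise ValueError("exposure_fraction must be in (0, 1]")

    # Number of sub-frames captured while the shutter is open (at least one).
    n_exposed = max(1, round(exposure_fraction * len(sharp_frames)))
    exposed = sharp_frames[:n_exposed]

    # Pixel-wise temporal average over the exposure window.
    n_pixels = len(exposed[0])
    return [sum(frame[i] for frame in exposed) / n_exposed for i in range(n_pixels)]


# A bright dot moving one pixel per sub-frame across a 4-pixel row.
frames = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
print(synthesize_blur(frames, 1.0))  # [0.25, 0.25, 0.25, 0.25]
print(synthesize_blur(frames, 0.5))  # [0.5, 0.5, 0.0, 0.0]
```

With the same motion, halving the exposure halves the length of the blur streak. A method that assumes the full frame period would therefore use events from the second half of the period even though they played no part in forming the blurred frame.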

Methodology

The primary contribution of the paper is an end-to-end learning framework for motion deblurring that leverages the high temporal resolution of event cameras. The framework introduces an Exposure Time-based Event Selection (ETES) module that selectively uses event features by estimating their cross-modal correlation with blurred-frame features.
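The idea of selecting event features by their correlation with frame features can be illustrated with a small gating sketch. This is not the ETES module itself, whose architecture is not detailed here: the cosine similarity measure, the sigmoid gate, and the `threshold` and `sharpness` parameters are assumptions chosen only to show the principle.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_event_features(event_bins, frame_feature, threshold=0.5, sharpness=10.0):
    """Softly gate each temporal bin of event features.

    event_bins: list of feature vectors, one per temporal bin of events.
    frame_feature: feature vector of the blurred frame.
    threshold, sharpness: hypothetical gate parameters (not from the paper).
    Returns (gated_bins, weights), where each weight lies in (0, 1).
    """
    gated, weights = [], []
    for bin_feature in event_bins:
        similarity = cosine_similarity(bin_feature, frame_feature)
        # Sigmoid gate: bins that correlate with the frame get weight near 1.
        weight = 1.0 / (1.0 + math.exp(-sharpness * (similarity - threshold)))
        weights.append(weight)
        gated.append([weight * x for x in bin_feature])
    return gated, weights


# The first bin aligns with the frame feature; the second is orthogonal to it.
gated, weights = select_event_features([[1.0, 0.0], [0.0, 1.0]], [2.0, 0.0])
print([round(w, 3) for w in weights])
```

In a learned model the similarity and gate would be produced by trained layers, so that bins outside the true exposure window receive low weight without the exposure time ever being given explicitly.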

The key components of the proposed method include:

Event Selection: Unlike prior methods, which assume a fixed relationship between exposure time and video frame rate, the ETES module estimates the correlation between event features and blurred-frame features to infer which events fall within the exposure period. The goal is to keep only the events that correspond to the unknown exposure duration.
Feature Fusion Module: A novel fusion mechanism is proposed to integrate selected event features with frame features, utilizing dynamic convolutions to enhance feature representations. This approach exploits the complementary nature of texture details present in frames and motion cues captured by events.
Dataset Generation: A new dataset is created using real-world events captured with DAVIS-346 cameras, simulating varying exposure times to train and validate the model, thereby enhancing its adaptability in realistic environments.
Figure 1: Overview of the proposed framework. For the encoder, blue, yellow, and green boxes represent an event encoder for the past part, a shared RNN-based event encoder for the current part, and a blur-frame encoder, respectively.
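The fusion component listed above relies on dynamic convolutions, that is, filters whose weights are predicted from the input rather than fixed after training. The 1-D sketch below shows the principle under loud assumptions: the function `dynamic_conv_fusion` is hypothetical, and a softmax over the local event features stands in for the learned kernel-prediction layers, which the summary does not describe.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_conv_fusion(frame_feat, event_feat, kernel_size=3):
    """Filter a 1-D frame feature with per-position kernels derived from events.

    Assumption: the kernel at each position is a softmax over the event
    features in the local window (a stand-in for a learned predictor).
    """
    if len(frame_feat) != len(event_feat) or not frame_feat:
        raise ValueError("features must be non-empty and of equal length")
    if kernel_size < 1 or kernel_size % 2 == 0:
        raise ValueError("kernel_size must be a positive odd number")

    half = kernel_size // 2
    n = len(frame_feat)
    fused = []
    for i in range(n):
        # Window indices, clamped at the borders (replicate padding).
        idx = [min(max(i + k, 0), n - 1) for k in range(-half, half + 1)]
        # A different kernel at every position, driven by the event features.
        kernel = softmax([event_feat[j] for j in idx])
        fused.append(sum(w * frame_feat[j] for w, j in zip(kernel, idx)))
    return fused


# With no event activity the kernels are uniform and the result is a box filter.
print(dynamic_conv_fusion([0.0, 3.0, 0.0], [0.0, 0.0, 0.0]))
```

Because the kernel changes with the event features at each position, regions with strong motion cues can be filtered differently from static regions, which is the property that makes dynamic convolution attractive for fusing the two modalities.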

Experimental Results

The proposed method was evaluated on synthetic and real-world datasets, where it outperforms existing frame-based and event-guided deblurring methods. It achieves state-of-the-art results across the benchmarks, with improvements in both PSNR and SSIM.

Synthetic Datasets: The method shows a substantial performance leap over prior models, particularly in challenging scenarios with extreme motion blur. The average improvement in PSNR ranges from 1.66 dB to 6.40 dB against other event-guided techniques.
Real-world Datasets: On datasets specifically collected for this study, the framework outperforms existing methods by a substantial margin, showcasing its robustness to unsteady motion and varying exposure times. Notably, the proposed method restores fine details in heavily blurred images where others fail.
Qualitative Evaluation: The paper provides qualitative comparisons indicating the model's capacity to restore intricate textures and edges, even under severe non-linear motion conditions.
Figure 2: Visual comparison on the test split of real-world event datasets.
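For readers unfamiliar with the PSNR figures quoted above, the standard definition is 10 log10(MAX^2 / MSE). The sketch below computes it for flat pixel lists; the function name and the assumption that pixel values lie in [0, max_value] are illustrative and do not reflect the paper's evaluation code.

```python
import math


def psnr(reference, restored, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if len(reference) != len(restored) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the sharp reference and the restored image.
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# A uniform error of 0.1 on a [0, 1] scale gives an MSE of 0.01, i.e. about 20 dB.
print(psnr([0.0, 0.0, 0.0, 0.0], [0.1, 0.1, 0.1, 0.1]))
```

Because the scale is logarithmic, a gain of 6.40 dB corresponds to a mean squared error roughly 4.4 times lower (10^0.64), and a gain of 1.66 dB to an error roughly 1.5 times lower (10^0.166).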

Implications and Future Work

This research sets a new baseline for motion deblurring in realistic settings where the traditional assumption of a known, fixed exposure time does not hold. Practically, this could benefit applications ranging from video restoration to real-time processing in autonomous systems.

Theoretically, this work opens avenues for cross-modal learning frameworks that adapt dynamically to the temporal variations inherent in real-world environments. Future work could improve computational efficiency and investigate additional modalities or combinations of sensory inputs to further improve deblurring performance.

Conclusion

The study proposes an approach to motion deblurring that leverages event cameras to cope with unknown and dynamically varying exposure times. By addressing a critical limitation of earlier deblurring methods, the paper both improves performance and broadens the applicability of DL-based solutions in practical, dynamic environments.
