Overview of eKalibr-Stereo for Event-Based Stereo Calibration
The paper "eKalibr-Stereo: Continuous-Time Spatiotemporal Calibration for Event-Based Stereo Visual Systems" introduces an advanced approach for calibrating stereo event camera systems. Event cameras, inspired by the biological retina, offer distinct advantages in low latency and high dynamic range. In ego-motion estimation and related applications, stereo event cameras are conventionally used for direct scale perception and depth recovery. Accurate calibration of such systems is crucial, particularly concerning their spatiotemporal parameters: the extrinsics (the relative rotation and translation between the two cameras) and the temporal offset between their event streams.
Methodology
eKalibr-Stereo builds upon the authors' previous work, eKalibr, by augmenting grid pattern recognition with a module for tracking incomplete patterns. This improves the continuity of pattern tracking, which in turn supports the two-step initialization procedure for calibration. The pipeline first performs normal flow estimation on the surface of active events to estimate circle centers, which form the basis for grid pattern recognition. Because pattern detection can fail when the grid is only partially visible, this work adds a motion prior-based tracking module that predicts grid circle locations using three-point Lagrange polynomials, improving tracking continuity.
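The three-point Lagrange prediction can be illustrated with a short sketch. This is not the authors' implementation; the function name and the per-coordinate quadratic extrapolation of a 2D circle center are assumptions for illustration:

```python
import numpy as np

def lagrange_predict(times, points, t_query):
    """Predict a 2D point at t_query by fitting a quadratic Lagrange
    polynomial through three (time, point) samples and extrapolating.

    times:  three timestamps [t0, t1, t2]
    points: three 2D positions, shape (3, 2)
    """
    t0, t1, t2 = times
    p = np.asarray(points, dtype=float)
    # Lagrange basis weights evaluated at the query time.
    l0 = (t_query - t1) * (t_query - t2) / ((t0 - t1) * (t0 - t2))
    l1 = (t_query - t0) * (t_query - t2) / ((t1 - t0) * (t1 - t2))
    l2 = (t_query - t0) * (t_query - t1) / ((t2 - t0) * (t2 - t1))
    # The basis weights sum to one, so this is an affine combination
    # of the three observed positions.
    return l0 * p[0] + l1 * p[1] + l2 * p[2]
```

A quadratic prior of this kind captures locally accelerated motion of the circle centers; the predicted location can then seed a local search when a full grid detection fails.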
Following pattern tracking, a two-step initialization performs continuous-time trajectory initialization and then spatiotemporal parameter estimation via continuous-time hand-eye alignment. Both steps represent time-varying states as B-splines, enabling efficient parameter inference despite the asynchronous nature of event streams. A final continuous-time batch optimization, based on bundle adjustment, then refines the state estimates.
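To make the B-spline representation concrete, the following sketch evaluates a uniform cubic B-spline segment in matrix form. Continuous-time estimators like eKalibr-Stereo use cumulative B-splines on the pose manifold; this simplified vector-space version (function name and interface are assumptions) only illustrates how a handful of control points yields a smooth, queryable trajectory:

```python
import numpy as np

def cubic_bspline_eval(ctrl_pts, u):
    """Evaluate a uniform cubic B-spline segment at normalized time
    u in [0, 1), given its four local control points (shape (4, d))."""
    # Blending matrix of the uniform cubic B-spline basis (matrix form
    # of the De Boor basis functions).
    M = (1.0 / 6.0) * np.array([
        [ 1.0,  4.0,  1.0, 0.0],
        [-3.0,  0.0,  3.0, 0.0],
        [ 3.0, -6.0,  3.0, 0.0],
        [-1.0,  3.0, -3.0, 1.0],
    ])
    u_vec = np.array([1.0, u, u**2, u**3])
    # Weights form a partition of unity, so the result stays in the
    # convex hull of the four control points.
    return u_vec @ M @ np.asarray(ctrl_pts, dtype=float)
```

Because the spline is defined at any real-valued time, each asynchronous event (or tracked grid observation) can be associated with an interpolated camera pose, which is what makes temporal-offset estimation tractable in this framework.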
Experimental Evaluation
Extensive real-world experiments demonstrated the effectiveness of the proposed method. The results showed that eKalibr-Stereo achieves high accuracy in both extrinsic and temporal calibration, comparable to conventional frame-based methods. Specifically, grid tracking rates increased by approximately 30% thanks to the incomplete grid tracking module. Moreover, the calibration errors followed a near-zero-mean Gaussian distribution, indicating unbiased and precise calibration.
Additionally, computational efficiency was evaluated, revealing that most computation time is dedicated to grid pattern extraction and tracking, with the method averaging around 5.5 minutes for stereo camera calibration. This highlights its practicality for real-world applications, such as robotic perception systems requiring frequent re-calibration.
Implications and Future Work
The successful implementation of eKalibr-Stereo suggests significant implications for the field of event-based vision systems. By utilizing continuous-time optimization with direct event input, this method sets a precedent for efficient, accurate calibration of complex sensor suites without conventional intensity images.
Future work could focus on further improving the robustness and computational efficiency of the method. As event camera technology matures, exploring novel applications in dynamic environments, such as autonomous vehicles and drone systems, could be valuable. The integration of additional sensors and the extension of this method to multi-camera setups may also enhance its applicability and operational scope in real-world scenarios.
In conclusion, eKalibr-Stereo represents a notable advancement in event-based stereo system calibration, promising enhanced performance in dynamic visual perception tasks. It not only strengthens the theoretical foundations for continuous-time calibration but also opens up avenues for practical implementations in state-of-the-art robotic systems.