FAST-LIVO2: Fast, Direct LiDAR-Inertial-Visual Odometry

Published 26 Aug 2024 in cs.RO and cs.CV | (2408.14035v2)

Abstract: This paper proposes FAST-LIVO2: a fast, direct LiDAR-inertial-visual odometry framework to achieve accurate and robust state estimation in SLAM tasks and provide great potential in real-time, onboard robotic applications. FAST-LIVO2 fuses the IMU, LiDAR and image measurements efficiently through an ESIKF. To address the dimension mismatch between the heterogeneous LiDAR and image measurements, we use a sequential update strategy in the Kalman filter. To enhance the efficiency, we use direct methods for both the visual and LiDAR fusion, where the LiDAR module registers raw points without extracting edge or plane features and the visual module minimizes direct photometric errors without extracting ORB or FAST corner features. The fusion of both visual and LiDAR measurements is based on a single unified voxel map where the LiDAR module constructs the geometric structure for registering new LiDAR scans and the visual module attaches image patches to the LiDAR points. To enhance the accuracy of image alignment, we use plane priors from the LiDAR points in the voxel map (and even refine the plane prior) and update the reference patch dynamically after new images are aligned. Furthermore, to enhance the robustness of image alignment, FAST-LIVO2 employs an on-demanding raycast operation and estimates the image exposure time in real time. Lastly, we detail three applications of FAST-LIVO2: UAV onboard navigation demonstrating the system's computation efficiency for real-time onboard navigation, airborne mapping showcasing the system's mapping accuracy, and 3D model rendering (mesh-based and NeRF-based) underscoring the suitability of our reconstructed dense map for subsequent rendering tasks. We open source our code, dataset and application on GitHub to benefit the robotics community.

Summary

The paper introduces FAST-LIVO2, a framework that fuses LiDAR, inertial, and visual data using a sequential ESIKF to enhance real-time SLAM efficiency.
It employs direct methods to process raw sensor data, reducing computational overhead and achieving precise alignment even in textureless or degenerated environments.
Validated on diverse datasets, FAST-LIVO2 demonstrates robust state estimation with little drift, making it well suited to autonomous navigation applications.

An Expert Overview of FAST-LIVO2: Fast, Direct LiDAR-Inertial-Visual Odometry

The paper introduces FAST-LIVO2, a novel framework that combines LiDAR, inertial, and visual measurements to address challenges in Simultaneous Localization and Mapping (SLAM) tasks, particularly in real-time applications. FAST-LIVO2 embeds these measurements within a sequentially updated Error-State Iterated Kalman Filter (ESIKF), a methodological choice that reflects the authors' emphasis on achieving both computational efficiency and enhanced robustness in diverse environments.

Technical Architecture and Methodology

FAST-LIVO2 fuses data from a LiDAR, an inertial measurement unit (IMU), and a camera. Processing proceeds in three steps: state propagation, scan recombination, and sequential state update. During scan recombination, LiDAR points, which arrive at a high rate, are regrouped into scans synchronized with camera frames, so that LiDAR and image measurements refer to the same time. This keeps data handling efficient, which matters when other tasks run concurrently on the onboard computer. The sequential update in the ESIKF then sidesteps the dimension mismatch between the heterogeneous measurements by applying the LiDAR and visual updates one after the other rather than in a single joint update.
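The scan recombination step can be illustrated with a minimal sketch. It assumes that every LiDAR point carries its own timestamp and that a recombined scan simply collects all points up to and including a camera frame time. The function name recombine_scans and the sample timestamps are hypothetical and are not taken from the FAST-LIVO2 code.

```python
from bisect import bisect_right

def recombine_scans(point_stamps, camera_stamps):
    """Group sorted LiDAR point timestamps into scans ending at camera frame times.

    Points newer than the last camera frame are not returned; in a live
    system they would wait for the next frame.
    """
    scans = []
    start = 0
    for t_cam in camera_stamps:
        # Index just past the last point with timestamp <= t_cam.
        end = bisect_right(point_stamps, t_cam, lo=start)
        scans.append(point_stamps[start:end])
        start = end
    return scans

points = [0.01, 0.04, 0.09, 0.11, 0.15, 0.21]   # LiDAR point times in seconds
frames = [0.10, 0.20]                           # camera frame times in seconds
print(recombine_scans(points, frames))
# → [[0.01, 0.04, 0.09], [0.11, 0.15]]
```

The point at 0.21 s is held back because no camera frame has closed its scan yet.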

Key Features and Contributions

The system differs from conventional ESIKF frameworks in its sequential update strategy, which accommodates the different dimensions and structures of LiDAR and image measurements. Both updates use direct methods: the LiDAR module registers raw points without extracting edge or plane features, and the visual module minimizes photometric error on raw image patches without extracting ORB or FAST corner features. This avoids the feature-extraction cost that other SLAM frameworks incur.
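The idea behind a sequential update can be shown on a toy problem. The sketch below is not the ESIKF from the paper: it assumes a scalar state, linear measurement models, and independent noise, and it applies the LiDAR measurement first, which is an assumption of this example. All names and numbers are hypothetical. Under these assumptions the posterior of the first update serves as the prior of the second, and the result equals that of a joint update.

```python
def kalman_update(x, P, z, R):
    """One scalar Kalman update for the measurement model z = x + noise(R)."""
    K = P / (P + R)            # Kalman gain
    x_new = x + K * (z - x)    # corrected state
    P_new = (1.0 - K) * P      # corrected variance
    return x_new, P_new

def sequential_update(x, P, z_lidar, R_lidar, z_visual, R_visual):
    """Apply the LiDAR update, then use its posterior as the visual prior."""
    x, P = kalman_update(x, P, z_lidar, R_lidar)
    x, P = kalman_update(x, P, z_visual, R_visual)
    return x, P

def joint_update(x, P, zs, Rs):
    """Fuse all measurements at once in information form, for comparison."""
    info = 1.0 / P + sum(1.0 / R for R in Rs)
    mean = (x / P + sum(z / R for z, R in zip(zs, Rs))) / info
    return mean, 1.0 / info

print(sequential_update(0.0, 1.0, 1.0, 1.0, 2.0, 0.5))
# → (1.25, 0.25)
print(joint_update(0.0, 1.0, [1.0, 2.0], [1.0, 0.5]))
# → (1.25, 0.25)
```

Each step handles only one measurement type, so the two measurement models never have to be stacked into a single vector.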

FAST-LIVO2 reduces processing time without sacrificing accuracy. The iterated Kalman filter updates the state sequentially with the LiDAR and visual measurements, and both modules share a single unified voxel map: the LiDAR module builds the geometric structure used to register new scans, while the visual module attaches image patches to LiDAR points and uses the planes in the map as priors for image alignment. Sharing one map limits the computational burden and supports dense reconstruction. The approach also helps where features are scarce, as in structureless or textureless environments.
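A rough sketch of the two ingredients mentioned here, a voxel map keyed by integer cell indices and a point-to-plane residual against a plane stored in the map, might look as follows. It assumes a fixed voxel size and a plane given by a unit normal and one point on it. The function names and values are hypothetical, and the paper's actual map structure may differ.

```python
import math
from collections import defaultdict

def voxel_key(point, voxel_size):
    """Integer index of the voxel that contains a 3D point."""
    return tuple(math.floor(c / voxel_size) for c in point)

def build_voxel_map(points, voxel_size):
    """Hash raw points into voxels; each voxel keeps the points that fall in it."""
    voxels = defaultdict(list)
    for p in points:
        voxels[voxel_key(p, voxel_size)].append(p)
    return voxels

def point_to_plane_residual(point, normal, plane_point):
    """Signed distance from a point to a plane; normal must have unit length."""
    return sum(n * (p - q) for n, p, q in zip(normal, point, plane_point))

cloud = [(0.1, 0.1, 0.1), (0.2, 0.3, 0.4), (0.6, 0.1, 0.1)]
vmap = build_voxel_map(cloud, voxel_size=0.5)
print(sorted(vmap))
# → [(0, 0, 0), (1, 0, 0)]
print(point_to_plane_residual((0.0, 0.0, 1.5), (0.0, 0.0, 1.0), (0.0, 0.0, 1.0)))
# → 0.5
```

Registering a raw point then amounts to looking up its voxel and driving this residual toward zero, with no feature extraction involved.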

System Validation and Implications

The system's performance is evaluated on extensive datasets, including sequences with severe lighting variation and LiDAR degeneration. Its use in autonomous UAV flights, where it supports onboard navigation while mapping in real time, underscores the practicality of FAST-LIVO2's design.

In contrast to its predecessors and peers, this framework demonstrates notable performance in accuracy, stability, and real-time computational efficiency, positioning it as a desirable choice for large-scale autonomous navigation applications. Notably, the tests validate the system’s ability to navigate texture-less and LiDAR-degenerated environments without significant end-to-end drift. Each module’s efficiency and effectiveness are subjected to rigorous ablation studies, reinforcing the robustness claims.

Future Prospects and Areas for Development

Although FAST-LIVO2 incorporates exposure-time estimation and gains efficiency from its voxel map structure, the framework, like other odometry systems, may still accumulate drift over long durations or large scenes. Integrating loop closure and sliding-window optimization is a prospective way to improve accuracy on long-term missions. Further, the dense, colored maps it generates can be leveraged for object-level semantic mapping, extending FAST-LIVO2 beyond conventional SLAM toward more complex robotic and AI applications.

FAST-LIVO2 is a step forward in LiDAR-inertial-visual SLAM research and lays a foundation for future work on real-world robotic applications, combining efficient sensor fusion with precise state estimation.