SchurVINS: Schur Complement-Based Lightweight Visual Inertial Navigation System

Published 4 Dec 2023 in cs.CV and cs.RO | (2312.01616v6)

Abstract: Accuracy and computational efficiency are the most important metrics to Visual Inertial Navigation System (VINS). The existing VINS algorithms with either high accuracy or low computational complexity, are difficult to provide the high precision localization in resource-constrained devices. To this end, we propose a novel filter-based VINS framework named SchurVINS, which could guarantee both high accuracy by building a complete residual model and low computational complexity with Schur complement. Technically, we first formulate the full residual model where Gradient, Hessian and observation covariance are explicitly modeled. Then Schur complement is employed to decompose the full model into ego-motion residual model and landmark residual model. Finally, Extended Kalman Filter (EKF) update is implemented in these two models with high efficiency. Experiments on EuRoC and TUM-VI datasets show that our method notably outperforms state-of-the-art (SOTA) methods in both accuracy and computational complexity. The experimental code of SchurVINS is available at https://github.com/bytedance/SchurVINS.

Summary

The paper introduces SchurVINS, a filter-based VINS framework that uses the Schur complement to balance high-precision localization with low computational overhead.
It employs a lightweight EKF-based landmark solver to estimate landmark positions, and it reports better accuracy and efficiency than state-of-the-art methods on the EuRoC and TUM-VI datasets.
The framework is optimized for resource-constrained devices, enabling efficient visual inertial navigation in mobile applications and MAVs.

The paper "SchurVINS: Schur Complement-Based Lightweight Visual Inertial Navigation System," authored by Yunfei Fan, Tianyu Zhao, and Guidong Wang, proposes an innovative filter-based Visual Inertial Navigation System (VINS) that effectively balances high accuracy and low computational complexity. This balance is particularly crucial for deployment in resource-constrained devices such as smartphones and micro aerial vehicles (MAVs).

Introduction

Visual Inertial Navigation Systems (VINS) combine cameras and inertial measurement units (IMUs) to estimate six-degree-of-freedom (6-DOF) pose. Because these sensors are inexpensive and easy to integrate into portable devices, VINS is an appealing alternative to costlier sensors such as LiDAR. However, traditional VINS algorithms struggle to achieve high precision and computational efficiency at the same time. Optimization-based methods typically provide high-accuracy localization but suffer from high computational complexity, while filter-based methods offer higher efficiency at the cost of reduced accuracy.

SchurVINS Framework

SchurVINS seeks to bridge this performance gap by employing a Schur complement-based decomposition to create an efficient and accurate VINS framework. The proposed system integrates the strengths of both optimization-based and filter-based methods.
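For readers unfamiliar with the technique, the generic Schur complement reduction on a two-block linear system can be written as follows. The notation is an assumption made for illustration and is not taken from the paper: \delta x is the ego-motion (pose) update, \delta f the landmark update, H the Hessian partitioned into pose and landmark blocks, and b the corresponding right-hand side.

```latex
\begin{aligned}
\begin{bmatrix} H_{xx} & H_{xf} \\ H_{fx} & H_{ff} \end{bmatrix}
\begin{bmatrix} \delta x \\ \delta f \end{bmatrix}
&= \begin{bmatrix} b_x \\ b_f \end{bmatrix}, \\
\left( H_{xx} - H_{xf} H_{ff}^{-1} H_{fx} \right) \delta x
&= b_x - H_{xf} H_{ff}^{-1} b_f, \\
\delta f &= H_{ff}^{-1} \left( b_f - H_{fx}\, \delta x \right).
\end{aligned}
```

The second line is the reduced system that involves only the ego-motion unknowns, and the third line recovers the landmarks by back-substitution. When every observation involves a single landmark, H_{ff} is block diagonal, so its inverse is cheap to compute; this sparsity is what makes the reduction attractive on small devices.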

Key Contributions

The main contributions of the paper are as follows:

Equivalent Residual Model: The authors introduce an equivalent residual model that explicitly includes the gradient, Hessian, and observation covariance. This model is designed to handle very high-dimensional observations efficiently.
EKF-Based Landmark Solver: A novel lightweight Extended Kalman Filter (EKF)-based solver is developed to estimate landmark positions with high efficiency.
High-Efficiency VINS Framework: The SchurVINS framework uses the Schur complement to decompose the full residual model into an ego-motion residual model and a landmark residual model, so that the EKF update can be carried out efficiently on each.
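The decomposition described in the contributions above can be illustrated with a small numerical sketch. It is not the authors' implementation: it applies a generic Schur complement elimination to a toy linear system in which the pose block has two unknowns and each of three landmarks is a single scalar (real landmarks are 3D), and it uses plain Python lists to stay dependency-free. All names (`solve`, `schur_solve`, `full_solve`, `H_xx`, `H_xf`, `H_ff_diag`, `b_x`, `b_f`) and all numeric values are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def schur_solve(H_xx, H_xf, H_ff_diag, b_x, b_f):
    """Eliminate the landmarks, solve for the pose, then back-substitute."""
    n, m = len(b_x), len(b_f)
    # Reduced pose system: S = H_xx - H_xf * H_ff^-1 * H_fx.
    # H_ff is diagonal here, so its inverse is an elementwise division.
    S = [[H_xx[i][j]
          - sum(H_xf[i][k] * H_xf[j][k] / H_ff_diag[k] for k in range(m))
          for j in range(n)] for i in range(n)]
    g = [b_x[i] - sum(H_xf[i][k] * b_f[k] / H_ff_diag[k] for k in range(m))
         for i in range(n)]
    dx = solve(S, g)
    # Each landmark is recovered independently from the pose update.
    df = [(b_f[k] - sum(H_xf[i][k] * dx[i] for i in range(n))) / H_ff_diag[k]
          for k in range(m)]
    return dx, df


def full_solve(H_xx, H_xf, H_ff_diag, b_x, b_f):
    """Reference solution: assemble and solve the full joint system."""
    n, m = len(b_x), len(b_f)
    H = [[0.0] * (n + m) for _ in range(n + m)]
    for i in range(n):
        for j in range(n):
            H[i][j] = H_xx[i][j]
        for k in range(m):
            H[i][n + k] = H_xf[i][k]
            H[n + k][i] = H_xf[i][k]  # symmetric coupling block
    for k in range(m):
        H[n + k][n + k] = H_ff_diag[k]
    return solve(H, b_x + b_f)


# Toy problem (values are made up): 2 pose unknowns, 3 scalar landmarks.
H_xx = [[4.0, 1.0], [1.0, 3.0]]
H_xf = [[1.0, 0.5, 0.0], [0.0, 1.0, 0.5]]
H_ff_diag = [2.0, 4.0, 5.0]
b_x = [1.0, 2.0]
b_f = [0.5, -1.0, 1.5]

dx, df = schur_solve(H_xx, H_xf, H_ff_diag, b_x, b_f)
reference = full_solve(H_xx, H_xf, H_ff_diag, b_x, b_f)
max_diff = max(abs(a - b) for a, b in zip(dx + df, reference))
print("pose update:", dx)
print("landmark updates:", df)
print("max difference from full solve:", max_diff)
```

The reduced solve only ever factors a matrix of the pose dimension, while each landmark is recovered with a cheap per-landmark step. The sketch shows the linear-algebra idea only; how SchurVINS embeds it in the EKF update is described in the paper itself.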

Experimental Evaluation

SchurVINS was evaluated on the EuRoC and TUM-VI datasets, demonstrating superior performance in accuracy and computational efficiency compared to state-of-the-art (SOTA) VINS methods.

Accuracy

On the EuRoC dataset, SchurVINS was more accurate than several filter-based and optimization-based methods: its average RMSE was the lowest among the filter-based methods and competitive with many optimization-based approaches.
Similar results were observed on the TUM-VI dataset, where SchurVINS exhibited superior accuracy compared to existing filter-based methods.
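As a reminder of what an RMSE figure of this kind measures, the sketch below computes the root-mean-square position error between an estimated and a ground-truth trajectory. It assumes the two trajectories are already time-associated and expressed in the same frame; the summary does not say which alignment or error definition the paper uses, so this is a generic illustration. The function name `position_rmse` and the sample points are hypothetical.

```python
import math


def position_rmse(estimated, ground_truth):
    """Root-mean-square Euclidean position error over matched samples."""
    if not estimated or len(estimated) != len(ground_truth):
        raise ValueError("trajectories must be non-empty and equally long")
    squared_errors = [
        sum((e - g) ** 2 for e, g in zip(est_point, gt_point))
        for est_point, gt_point in zip(estimated, ground_truth)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up 3D positions in metres, already aligned and time-matched.
est = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.1), (2.0, 0.1, 0.0)]
gt = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
rmse = position_rmse(est, gt)
print(f"RMSE: {rmse:.4f} m")
```

In practice, benchmark evaluations usually align the estimated trajectory to the ground truth before computing this quantity, a step that is omitted here.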

Computational Efficiency

SchurVINS required significantly fewer computational resources than several optimization-based methods such as BASALT and DM-VIO, while maintaining comparable accuracy.
The EKF-based landmark solver in SchurVINS led to noticeable improvements in efficiency compared to traditional optimization methods. The detailed runtime analysis revealed that the proposed method incurs lower overhead in key computational modules compared to SVO2.0 and OpenVINS.

Implications and Future Work

The combination of Schur complement and EKF facilitates an efficient and accurate VINS framework, making SchurVINS well-suited for resource-constrained environments. Practically, this could enhance the use of VINS in mobile devices and MAVs where computational resources and power are limited. The theoretical implications underline the potential of leveraging sparsity and decomposition techniques in high-dimensional state estimation problems.

For future research, the authors suggest focusing on local map refinement within SchurVINS to further improve accuracy. Such refinement could strengthen the practicality and reliability of SchurVINS in applications such as robotics, augmented reality (AR), and virtual reality (VR).

Conclusion

SchurVINS presents a balanced solution to the trade-off between accuracy and efficiency in VINS algorithms. By utilizing a Schur complement-based decomposition, it achieves high-precision localization while significantly reducing computational overhead. The framework's ability to outperform or match current SOTA methods in both accuracy and efficiency underscores its potential impact on the field of visual-inertial navigation systems.