RMGS-SLAM: Real-time Multi-sensor Gaussian Splatting SLAM

Published 14 Apr 2026 in cs.RO | (2604.12942v1)

Abstract: Real-time 3D Gaussian splatting (3DGS)-based Simultaneous Localization and Mapping (SLAM) in large-scale real-world environments remains challenging, as existing methods often struggle to jointly achieve low-latency pose estimation, 3D Gaussian reconstruction in step with incoming sensor streams, and long-term global consistency. In this paper, we present a tightly coupled LiDAR-Inertial-Visual (LIV) 3DGS-based SLAM framework for real-time pose estimation and photorealistic mapping in large-scale real-world scenes. The system executes state estimation and 3D Gaussian primitive initialization in parallel with global Gaussian optimization, thereby enabling continuous dense mapping. To improve Gaussian initialization quality and accelerate optimization convergence, we introduce a cascaded strategy that combines feed-forward predictions with voxel-based principal component analysis (voxel-PCA) geometric priors. To enhance global consistency in large scenes, we further perform loop closure directly on the optimized global Gaussian map by estimating loop constraints through Gaussian-based Generalized Iterative Closest Point (GICP) registration, followed by pose-graph optimization. In addition, we collected challenging large-scale looped outdoor SLAM sequences with hardware-synchronized LiDAR-camera-IMU and ground-truth trajectories to support realistic and comprehensive evaluation. Extensive experiments on both public datasets and our dataset demonstrate that the proposed method achieves a strong balance among real-time efficiency, localization accuracy, and rendering quality across diverse and challenging real-world scenes.

Summary

  • The paper introduces a novel SLAM framework that leverages multi-sensor fusion and 3D Gaussian splatting for real-time, photorealistic mapping.
  • It employs a cascaded initialization strategy combining feed-forward predictions with voxel-PCA geometric priors to ensure high-fidelity reconstruction.
  • Empirical results demonstrate superior rendering performance, localization accuracy, and efficient runtime compared to leading baseline methods.

System Architecture and Methodological Advances

RMGS-SLAM introduces a real-time, tightly coupled SLAM framework based on 3D Gaussian splatting that uses LiDAR-Inertial-Visual (LIV) fusion for large-scale photorealistic mapping and robust localization. The system consists of four key modules: (i) an LIV front-end for ego-motion estimation and pose-synchronized data generation; (ii) a cascaded strategy for initializing 3D Gaussian primitives, fusing feed-forward predictions with voxel-PCA-based geometric priors; (iii) global Gaussian map optimization under photometric, structural, and geometric metrics; and (iv) a Gaussian-based loop closure mechanism using GICP followed by pose-graph optimization.

Figure 1: Overview of the four-module RMGS-SLAM system: LIV front-end, cascaded 3D Gaussian primitive initialization, asynchronous optimization, and Gaussian-based loop closure.
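
The asynchronous split named in the overview, where the front-end and initialization run alongside global optimization, can be pictured with a small producer/consumer sketch. This is only an illustration under stated assumptions: the function `run_pipeline`, the queue hand-off, and the placeholder "primitives" are hypothetical and do not come from the paper, which does not describe its threading model here.

```python
import queue
import threading

def run_pipeline(keyframe_ids):
    """Hypothetical two-thread layout: a front-end thread produces keyframes,
    an optimizer thread consumes them concurrently."""
    pending = queue.Queue()
    gaussian_map = []      # stand-in for the global Gaussian map
    sentinel = object()    # marks the end of the stream

    def front_end():
        for kf in keyframe_ids:
            # Placeholder for state estimation and primitive initialization.
            pending.put({"id": kf, "primitives": [f"g{kf}"]})
        pending.put(sentinel)

    def optimizer():
        while True:
            item = pending.get()
            if item is sentinel:
                break
            # Placeholder for global Gaussian optimization on the new primitives.
            gaussian_map.extend(item["primitives"])

    threads = [threading.Thread(target=front_end),
               threading.Thread(target=optimizer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return gaussian_map
```

Because the queue decouples the two threads, the front-end never waits for an optimization step to finish, which is the property the overview attributes to the asynchronous design.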

Fitting voxel-PCA geometric priors addresses the anisotropic and heterogeneous structures found in large-scale real-world scenes and provides meaningful descriptors for robust initialization. The full system is highly parallelized: state estimation and initialization run in parallel with continuous global optimization, which sustains real-time operation in dynamic, high-throughput settings.
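
To make the voxel-PCA idea concrete, here is a minimal sketch assuming NumPy. It groups points into voxels and takes the eigen-decomposition of each voxel's covariance as a candidate scale and orientation for a Gaussian primitive. The function name `voxel_pca_priors`, the voxel size, and the minimum point count are illustrative assumptions, not values from the paper.

```python
import numpy as np

def voxel_pca_priors(points, voxel_size=0.5, min_points=5):
    """Return {voxel_index: (mean, scales, rotation)} from per-voxel PCA.

    scales are the square roots of the covariance eigenvalues (ascending),
    rotation holds the matching eigenvectors as columns.
    """
    points = np.asarray(points, dtype=float)
    keys = np.floor(points / voxel_size).astype(int)
    uniq, inverse = np.unique(keys, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)  # shape differs across NumPy versions
    priors = {}
    for i, key in enumerate(uniq):
        pts = points[inverse == i]
        if len(pts) < min_points:
            continue  # too few points for a stable covariance estimate
        mean = pts.mean(axis=0)
        cov = np.cov(pts.T)                      # 3x3 sample covariance
        eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
        scales = np.sqrt(np.maximum(eigvals, 1e-12))
        if np.linalg.det(eigvecs) < 0:
            eigvecs[:, 0] *= -1                  # keep a right-handed frame
        priors[tuple(int(k) for k in key)] = (mean, scales, eigvecs)
    return priors
```

For points sampled from a flat surface, the smallest scale is close to zero and its eigenvector approximates the surface normal, which is what makes such a prior anisotropic rather than a sphere.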

Cascaded Initialization for High-Fidelity Reconstruction

A significant innovation lies in the cascaded initialization protocol for 3D Gaussian primitives. For every keyframe, 3D points are aggregated and primitive attributes (scale, rotation, opacity, spherical harmonics) are initialized by prioritizing feed-forward model predictions when reliable, otherwise falling back to voxel-PCA geometric priors or a depth-based isotropic heuristic.

Figure 2: The cascaded initialization maps each point to the best-available prior (feed-forward, PCA, or heuristic), projects and interpolates attributes, and transforms them to the global frame.
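
The fallback order described above can be sketched as a simple priority rule. The reliability test (a confidence threshold) and the depth-based heuristic (scale proportional to depth) are assumptions made for illustration; the summary does not specify either, and the names `Prior` and `select_prior` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Prior:
    scales: Tuple[float, float, float]
    source: str  # "feed_forward", "voxel_pca", or "heuristic"

def select_prior(feed_forward: Optional[Prior],
                 ff_confidence: float,
                 voxel_pca: Optional[Prior],
                 depth: float,
                 conf_threshold: float = 0.5,   # assumed reliability cutoff
                 pixel_angle: float = 1e-3      # assumed radians per pixel
                 ) -> Prior:
    """Pick the best-available prior for one point."""
    # 1. Prefer the feed-forward prediction when it looks reliable.
    if feed_forward is not None and ff_confidence >= conf_threshold:
        return feed_forward
    # 2. Otherwise use the geometric prior from voxel-PCA, if one exists.
    if voxel_pca is not None:
        return voxel_pca
    # 3. Last resort: an isotropic scale that grows with depth.
    s = depth * pixel_angle
    return Prior((s, s, s), "heuristic")
```

Ordering the checks this way means every point receives some initialization, while the more informative priors are used whenever they are available.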

This strategy ensures both rapid convergence during optimization and higher reliability in the presence of sensor or perception uncertainty. The use of model-based priors (e.g., from G3Splat) is shown empirically to outperform alternatives such as DepthSplat both qualitatively and quantitatively.

Direct Gaussian-based Loop Closure

A core contribution is the introduction of direct loop closure on the global Gaussian map. Candidate history-loop pairs are identified by spatio-temporal analysis and their relative poses are estimated via a Gaussian-adapted GICP. Loop closure constraints are then incorporated in a pose graph jointly with LIV odometry, updating associated primitive poses and view supervision to propagate correction throughout the map. The system regularizes primitive covariances for robustness and maintains dense, consistent reconstructions during multi-loop traversals.
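
Two ingredients of this step lend themselves to a short sketch, assuming NumPy: picking loop candidates that are close in space but far apart in time, and regularizing a covariance before registration. Both functions are hypothetical illustrations. The distance and time thresholds are made up, and eigenvalue clamping is one common way to regularize covariances for GICP-style registration, not necessarily the scheme used in the paper.

```python
import numpy as np

def loop_candidates(positions, timestamps, max_dist=5.0, min_gap=30.0):
    """Return pairs (i, j), i < j, that are near in space and far in time."""
    positions = np.asarray(positions, dtype=float)
    timestamps = np.asarray(timestamps, dtype=float)
    pairs = []
    for j in range(len(positions)):
        for i in range(j):
            far_in_time = timestamps[j] - timestamps[i] >= min_gap
            near_in_space = np.linalg.norm(positions[j] - positions[i]) <= max_dist
            if far_in_time and near_in_space:
                pairs.append((i, j))
    return pairs

def regularize_covariance(cov, min_ratio=1e-3):
    """Clamp eigenvalues so the smallest is at least min_ratio times the largest."""
    cov = np.asarray(cov, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(cov)          # ascending eigenvalues
    floor = max(eigvals[-1], 1e-12) * min_ratio
    clamped = np.maximum(eigvals, floor)
    return eigvecs @ np.diag(clamped) @ eigvecs.T   # rebuild a well-conditioned matrix
```

Clamping keeps very flat Gaussians invertible, so a registration cost that uses inverse covariances does not blow up along the thin direction.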

Empirical Evaluation

RMGS-SLAM is evaluated against representative Gaussian-SLAM and LIV-SLAM baselines, including MonoGS, SplaTAM, GS-LIVM, Gaussian-LIC2, and FAST-LIVO2. The evaluation covers indoor sequences and, more importantly, challenging large-scale looped outdoor trajectories, drawn from both public datasets and the authors' new dataset with ground-truth trajectories. Key findings:

  • Rendering Performance: RMGS-SLAM achieves the highest or near-highest PSNR, SSIM, and lowest LPIPS across all evaluated sequences, consistently outperforming alternatives in both small and large-scale settings, including under challenging real-world sensor conditions.
  • Localization Accuracy: RMGS-SLAM yields the lowest ATE-RMSE on all evaluated sequences, with a substantial reduction in accumulated trajectory drift (e.g., 0.41 m on Driving1 versus 3.60 m for FAST-LIVO2, with the other baselines performing worse).

    Figure 4: Trajectory comparison for the Driving1 sequence: RMGS-SLAM demonstrates markedly superior global trajectory closure and geometric consistency.

    Figure 3: Module-wise runtime analysis reveals efficient parallelization and stable real-time performance, even as loop closure becomes dominant at large scales.

  • Runtime: The system achieves real-time factors close to unity on all but the largest scenes, and maintains full-pipeline operation at sensor-rate, unlike many baselines that fail to complete long trajectories within time constraints.
  • Ablation: Both voxel-PCA priors and Gaussian loop closure are critical; disabling either leads to measurable degradation in rendering and pose accuracy.
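
For readers unfamiliar with the localization metric above, ATE-RMSE is commonly computed by rigidly aligning the estimated trajectory to the ground truth and taking the root-mean-square of the remaining position errors. The sketch below, assuming NumPy, uses a Kabsch-style alignment without scale. It illustrates the usual definition and is not the paper's evaluation script; the function name `ate_rmse` is hypothetical.

```python
import numpy as np

def ate_rmse(estimated, ground_truth):
    """RMSE of position errors after rigid (rotation + translation) alignment.

    Both inputs are (N, 3) arrays of time-associated positions.
    """
    est = np.asarray(estimated, dtype=float)
    gt = np.asarray(ground_truth, dtype=float)
    est_c = est - est.mean(axis=0)
    gt_c = gt - gt.mean(axis=0)
    # Kabsch: find the rotation R minimizing sum ||R @ est_c[i] - gt_c[i]||^2.
    H = est_c.T @ gt_c
    U, _, Vt = np.linalg.svd(H)
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T     # proper rotation, det = +1
    aligned = est_c @ R.T + gt.mean(axis=0)
    errors = np.linalg.norm(aligned - gt, axis=1)
    return float(np.sqrt(np.mean(errors ** 2)))
```

The alignment removes any global offset between coordinate frames, so the metric reflects drift and inconsistency in the trajectory shape rather than the choice of origin.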

Qualitative Analysis of Map Quality

The visual comparison reveals superior sharpness and consistency, especially after revisiting previously mapped areas, attributable to effective loop closure and robust initialization. RMGS-SLAM produces coherent photorealistic renderings with faithful view-dependent effects and suppresses the trajectory and structural drift seen in all competing methods.

Implications, Limitations, and Future Directions

Practically, enhanced real-time photorealistic mapping and robust long-term localization facilitate downstream applications such as autonomous navigation and manipulation in complex environments, as evidenced by high-fidelity map quality and resilient global consistency. Theoretically, the integration of generalizable feed-forward models with multi-modal data fusion sets a precedent for scalable, appearance-aware mapping architectures. The direct loop closure over Gaussian primitives opens avenues for global optimization within learned scene representations, with implications for both SLAM and neural rendering.

Limitations include the computational load of global optimization and loop closure, particularly on memory-constrained hardware and in extremely large-scale multi-loop scenarios, which may limit deployment without further system co-design. Extension to dynamic environments, integration with semantic priors, and hardware-efficient algorithm design remain fertile directions.

Conclusion

RMGS-SLAM advances the state of the art in real-time multi-sensor Gaussian Splatting SLAM by unifying efficient, quality-driven primitive initialization, asynchronous global optimization, and direct Gaussian-based loop closure. The empirical evaluations demonstrate robust real-time performance and superior localization accuracy and map quality relative to prior methods. This framework provides a promising foundation for scalable, appearance-aware scene representations in autonomous robotics, with direct extensibility toward downstream perception and planning tasks.
