
coVoxSLAM: GPU Accelerated Globally Consistent Dense SLAM

Published 28 Oct 2024 in cs.RO | (2410.21149v1)

Abstract: A dense SLAM system is essential for mobile robots, as it provides localization and allows navigation, path planning, obstacle avoidance, and decision-making in unstructured environments. Due to increasing computational demands the use of GPUs in dense SLAM is expanding. In this work, we present coVoxSLAM, a novel GPU-accelerated volumetric SLAM system that takes full advantage of the parallel processing power of the GPU to build globally consistent maps even in large-scale environments. It was deployed on different platforms (discrete and embedded GPU) and compared with the state of the art. The results obtained using public datasets show that coVoxSLAM delivers a significant performance improvement considering execution times while maintaining accurate localization. The presented system is available as open-source on GitHub https://github.com/lrse-uba/coVoxSLAM.


Summary

  • The paper introduces a GPU-accelerated dense SLAM system that builds globally consistent maps in real time.
  • It combines raycasting-based TSDF integration with GPU-based pose graph optimization, and reports TSDF integration speed-ups of up to 140x over Voxblox.
  • The system maintains low trajectory RMSE on large-scale LiDAR data, which supports its practical use in mobile robotics.

An Overview of the coVoxSLAM System: GPU-Accelerated Dense SLAM

The paper "coVoxSLAM: GPU Accelerated Globally Consistent Dense SLAM" by Emiliano Höss and Pablo De Crist presents a GPU-accelerated approach to dense Simultaneous Localization and Mapping (SLAM). coVoxSLAM builds on existing volumetric SLAM methods and moves their processing onto the GPU, aiming for real-time map building in large-scale, complex environments while keeping the map globally consistent and the localization accurate.

Introduction and Motivation

The authors stress that dense SLAM systems are what allow mobile robots to navigate unstructured environments autonomously. A SLAM system estimates the robot's trajectory while incrementally building a map of the observed area. Traditional systems struggle with the computational cost of this task, especially when they must keep a map globally consistent and accurate over extended areas, because loop closing and pose estimation require substantial resources. Using GPUs in dense SLAM is a response to these growing demands, since the workload maps well onto their parallel processing capabilities.

System Architecture

coVoxSLAM’s architecture consists of two components, a frontend and a backend, both designed to run on the GPU. The frontend integrates sensor data into a Truncated Signed Distance Field (TSDF) using an improved raycasting technique rather than traditional projection mapping. Raycasting distributes the computational workload evenly across GPU threads, which improves efficiency. The backend runs pose graph optimization entirely on the GPU to refine submap alignment, minimizing the cost function with iterative methods such as Conjugate Gradient and Levenberg-Marquardt.
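To make the frontend step concrete, here is a minimal CPU sketch of raycasting-based TSDF integration in Python. It is not the authors' CUDA implementation: the function name `integrate_ray`, the 2D sparse grid stored as a dictionary, the constant per-observation weight of 1, and the half-voxel sampling step are all assumptions chosen for illustration. Each call handles one sensor ray, which is roughly the unit of work that could be assigned to one GPU thread, and fuses the measurement with the standard weighted running average.

```python
import math

def integrate_ray(grid, origin, hit, voxel_size=0.1, truncation=0.3):
    """Fuse one range measurement into a sparse 2D TSDF grid.

    grid maps a voxel index (ix, iy) to a (distance, weight) pair.
    Hypothetical helper for illustration; not taken from coVoxSLAM.
    """
    dx, dy = hit[0] - origin[0], hit[1] - origin[1]
    rng = math.hypot(dx, dy)
    if rng == 0.0:
        return  # degenerate ray, nothing to integrate
    ux, uy = dx / rng, dy / rng  # unit direction of the ray

    visited = set()
    # Walk only the truncation band around the measured surface.
    t = max(0.0, rng - truncation)
    t_end = rng + truncation
    step = 0.5 * voxel_size  # assumed sampling step
    while t <= t_end:
        px, py = origin[0] + t * ux, origin[1] + t * uy
        key = (math.floor(px / voxel_size), math.floor(py / voxel_size))
        if key not in visited:  # update each voxel at most once per ray
            visited.add(key)
            # Depth of the voxel centre measured along the ray.
            cx = (key[0] + 0.5) * voxel_size
            cy = (key[1] + 0.5) * voxel_size
            depth = (cx - origin[0]) * ux + (cy - origin[1]) * uy
            # Positive in front of the surface, negative behind it.
            sdf = max(-truncation, min(truncation, rng - depth))
            d_old, w_old = grid.get(key, (0.0, 0.0))
            w_new = w_old + 1.0  # assumed constant observation weight
            grid[key] = ((d_old * w_old + sdf) / w_new, w_new)
        t += step

# Usage: two slightly different range readings along the x axis.
grid = {}
integrate_ray(grid, (0.0, 0.0), (1.0, 0.0))
integrate_ray(grid, (0.0, 0.0), (1.02, 0.0))
```

After both calls, the voxel just in front of the surface holds the average of the two truncated distances and a weight of 2. A real GPU version would also have to handle concurrent updates to the same voxel from different rays, which this serial sketch does not address.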

Numerical Evaluation and Results

The authors evaluate coVoxSLAM against existing systems such as Voxgraph and nvblox. The results show a marked reduction in execution time. For TSDF integration over large-scale LiDAR datasets, the system achieves speed-ups of 30x to 140x over Voxblox, the TSDF mapping library that Voxgraph builds on. It is also reported to run up to 2x faster than nvblox in similar scenarios. At the same time, coVoxSLAM maintains accuracy comparable to previous state-of-the-art methods, as measured by the root mean square error (RMSE) against ground-truth trajectories.
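The accuracy metric can be illustrated with a short Python sketch of a trajectory RMSE. It assumes the estimated and ground-truth positions are already time-associated and expressed in the same frame; the summary does not describe the paper's exact alignment protocol, and the function name `trajectory_rmse` is hypothetical.

```python
import math

def trajectory_rmse(estimated, ground_truth):
    """Root mean square of per-pose position errors.

    Both arguments are equal-length sequences of (x, y, z) positions,
    assumed to be paired by timestamp and aligned beforehand.
    """
    if not estimated or len(estimated) != len(ground_truth):
        raise ValueError("trajectories must be non-empty and of equal length")
    squared_error = 0.0
    for p, q in zip(estimated, ground_truth):
        # Squared Euclidean distance between the paired positions.
        squared_error += sum((a - b) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared_error / len(estimated))

# Usage: one exact pose and one pose that is 5 units off.
est = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
gt = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
rmse = trajectory_rmse(est, gt)  # sqrt((0 + 25) / 2)
```

Because the errors are squared before averaging, a few large deviations raise the RMSE more than many small ones, so it penalizes drift that a mean error would partly hide.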

Implications and Future Directions

coVoxSLAM shows that volumetric mapping can run in real time in large and intricate environments without a prohibitive computational cost. Because the system runs efficiently on both discrete and embedded GPUs, it is well suited to on-board use in mobile robotics.

Looking ahead, combining coVoxSLAM with deep learning techniques could improve the quality and robustness of dense SLAM, especially in feature extraction and semantic mapping. As GPU architectures continue to evolve, systems like coVoxSLAM could also benefit from more computational power and better parallel processing models, which would support more advanced autonomous navigation.

In conclusion, coVoxSLAM demonstrates how GPU acceleration can address the computational challenges of dense SLAM. The reported results give later work a reference point and invite further research into extending the system to broader applications.
