
ActiveSplat: High-Fidelity Scene Reconstruction through Active Gaussian Splatting

Published 29 Oct 2024 in cs.RO and cs.CV (arXiv:2410.21955v2)

Abstract: We propose ActiveSplat, an autonomous high-fidelity reconstruction system leveraging Gaussian splatting. Taking advantage of efficient and realistic rendering, the system establishes a unified framework for online mapping, viewpoint selection, and path planning. The key to ActiveSplat is a hybrid map representation that integrates both dense information about the environment and a sparse abstraction of the workspace. Therefore, the system leverages sparse topology for efficient viewpoint sampling and path planning, while exploiting view-dependent dense prediction for viewpoint selection, facilitating efficient decision-making with promising accuracy and completeness. A hierarchical planning strategy based on the topological map is adopted to mitigate repetitive trajectories and improve local granularity given limited time budgets, ensuring high-fidelity reconstruction with photorealistic view synthesis. Extensive experiments and ablation studies validate the efficacy of the proposed method in terms of reconstruction accuracy, data coverage, and exploration efficiency. The released code will be available on our project page: https://li-yuetao.github.io/ActiveSplat/.


Summary

  • The paper introduces ActiveSplat, an autonomous system that leverages Gaussian splatting for high-accuracy, real-time 3D scene mapping.
  • It employs a hybrid map representation that fuses dense Gaussian primitives with a sparse Voronoi graph to enhance viewpoint selection and path planning.
  • Hierarchical planning and post-processing optimizations refine both local details and global exploration, offering significant potential for robotic and VR applications.

High-Fidelity Scene Reconstruction Through Active Gaussian Splatting

The paper "ActiveSplat: High-Fidelity Scene Reconstruction through Active Gaussian Splatting" introduces an autonomous system for high-fidelity 3D scene reconstruction. ActiveSplat leverages Gaussian splatting to unify online mapping, viewpoint selection, and path planning in a single framework. Its hybrid map representation fuses dense environmental data with a sparse abstraction of the workspace, enabling decision-making that is both efficient and accurate.

Technical Overview

ActiveSplat uses Gaussian splatting for both online map updating and photorealistic view synthesis, addressing the computational inefficiency and noise susceptibility of earlier NeRF-based methods. A hierarchical planning strategy rooted in a topological map representation mitigates redundant trajectories and improves local granularity, allowing the system to navigate and explore unknown environments efficiently.
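The core decision the planner makes at each step, choosing the viewpoint expected to reveal the most of the scene, can be illustrated with a minimal toy sketch. This is not the paper's implementation: the 2D boolean grid, the circular sensing model, and the function names below are all illustrative assumptions standing in for the actual rendering-based prediction over the Gaussian map.

```python
import numpy as np

def coverage_gain(observed, viewpoint, radius):
    """Count currently unobserved cells a viewpoint would reveal.

    Toy sensing model (an assumption, not the paper's): a viewpoint
    observes every cell within a fixed radius on a 2D grid.
    """
    ys, xs = np.indices(observed.shape)
    visible = (ys - viewpoint[0]) ** 2 + (xs - viewpoint[1]) ** 2 <= radius ** 2
    return int(np.count_nonzero(visible & ~observed))

def select_viewpoint(observed, candidates, radius=3):
    """Greedy selection: pick the candidate with the largest predicted gain."""
    return max(candidates, key=lambda v: coverage_gain(observed, v, radius))

# Top half of the workspace is already mapped; the planner should
# prefer the candidate sitting in the unexplored bottom half.
observed = np.zeros((20, 20), dtype=bool)
observed[:10, :] = True
best = select_viewpoint(observed, [(2, 2), (15, 15)])
print(best)  # → (15, 15)
```

In ActiveSplat the candidate set comes from the sparse topological graph rather than a fixed list, and the gain is estimated by view-dependent dense prediction from the Gaussian map, but the selection logic follows the same greedy pattern.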

Key Features of ActiveSplat:

  1. Hybrid Map Representation: The system utilizes a dense representation of Gaussian primitives for accurate and detailed scene predictions. Simultaneously, it extracts a sparse Voronoi graph from the workspace to guide efficient path planning and viewpoint selection.
  2. Viewpoint Selection: The proposed decoupled approach to position and rotation in viewpoint selection ensures comprehensive exploration while minimizing redundant data capture. This strategy balances efficiency against the need for thorough scene coverage.
  3. Hierarchical Planning: By dynamically partitioning the workspace into subregions via a Voronoi graph and employing a dual-level planning strategy, the system maximizes exploration effectiveness. This approach ensures intricate local inspection while maintaining a broader global exploration strategy.
  4. Post-Processing Optimization: The system also allows for post-processing refinements, augmenting the photorealistic reconstruction quality by utilizing offline optimization techniques applied to stored keyframe data.
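The Voronoi-based partition underlying features 1 and 3 can be sketched with a toy nearest-site assignment over a grid. The grid, the site placement, and the function name are illustrative assumptions; the paper extracts its graph from the mapped workspace rather than from fixed sites.

```python
import numpy as np

def voronoi_partition(shape, sites):
    """Label every grid cell with the index of its nearest site.

    Nearest-site assignment is exactly a discrete Voronoi partition:
    each site claims the cells closer to it than to any other site.
    """
    ys, xs = np.indices(shape)
    d2 = [(ys - sy) ** 2 + (xs - sx) ** 2 for sy, sx in sites]
    return np.argmin(np.stack(d2), axis=0)

# Three hypothetical sites partition a 16x16 workspace into subregions.
# A two-level planner would order the regions globally, then select
# viewpoints inside the active region for fine-grained local inspection.
sites = [(4, 4), (4, 12), (12, 8)]
labels = voronoi_partition((16, 16), sites)
region_sizes = [int(np.count_nonzero(labels == k)) for k in range(len(sites))]
```

Planning over a handful of such subregions, instead of over every cell, is what keeps the global level of the hierarchy cheap while the dense Gaussian map handles local detail.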

Theoretical and Practical Implications

The theoretical implications of this research are substantial, offering new insights into efficient scene reconstruction through the combination of explicit scene representations and hierarchical exploration strategies. The integration of Gaussian splatting to achieve real-time mapping and reconstruction presents a compelling approach that may redefine methodologies for 3D scene capture and photorealistic rendering.

Practically, ActiveSplat has potential applications in fields that require precise environmental modeling, such as autonomous navigation, robotic perception, and virtual reality simulation. By creating highly accurate digital twins of physical environments, the system can improve sim-to-real transfer, which in turn could support autonomous systems performing complex decision-making in diverse and unpredictable real-world settings.

Future Developments

ActiveSplat opens several avenues for future research. Enhancements could focus on scaling the system for outdoor environments, integrating multimodal sensor data for richer scene understanding, or optimizing path planning algorithms for dynamic environments. Moreover, the possibility of extending the approach to other emerging technologies such as drone navigation or underwater exploration holds promise.

In summary, ActiveSplat presents a robust framework for scene reconstruction, advancing the field through its synthesis of Gaussian splatting and hierarchical planning. It holds clear potential for applications in domains requiring high-accuracy environmental modeling, and it exemplifies how hybrid representations can balance computational efficiency against reconstruction fidelity.
