EvGGS: A Collaborative Learning Framework for Event-based Generalizable Gaussian Splatting
Abstract: Event cameras offer promising advantages such as high dynamic range and low latency, making them well-suited for challenging lighting conditions and fast-moving scenarios. However, reconstructing 3D scenes from raw event streams is difficult because event data is sparse and carries no absolute color information. To unlock its potential for 3D reconstruction, we propose the first event-based generalizable 3D reconstruction framework, called EvGGS, which reconstructs scenes as 3D Gaussians from event input alone in a feedforward manner and generalizes to unseen cases without any retraining. The framework comprises a depth estimation module, an intensity reconstruction module, and a Gaussian regression module. These submodules are connected in a cascade, and we train them collaboratively with a designed joint loss so that they mutually reinforce one another. To facilitate related studies, we build a novel event-based 3D dataset with objects of various materials and calibrated labels of grayscale images, depth maps, camera poses, and silhouettes. Experiments show that jointly trained models significantly outperform those trained individually. Our approach surpasses all baselines in reconstruction quality and in depth/intensity prediction, while achieving satisfactory rendering speed.
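The cascade described in the abstract (events → depth estimation → intensity reconstruction → Gaussian regression, supervised end-to-end by one joint loss) can be sketched as follows. This is a minimal toy illustration, not the paper's implementation: all module bodies, names, and loss weights are placeholder assumptions; only the cascaded structure and the weighted joint loss follow the abstract.

```python
# Toy sketch of a cascaded event-based reconstruction pipeline with a
# joint loss. Each "module" is a stand-in for a learned network.

def depth_module(events):
    """Placeholder: predict depth from event input."""
    return [abs(e) for e in events]

def intensity_module(events, depth):
    """Placeholder: reconstruct intensity from events, conditioned on depth."""
    return [e + d for e, d in zip(events, depth)]

def gaussian_module(depth, intensity):
    """Placeholder: regress one Gaussian primitive per pixel."""
    return [{"pos": d, "feat": i} for d, i in zip(depth, intensity)]

def render(gaussians):
    """Placeholder for differentiable Gaussian splatting."""
    return [g["feat"] for g in gaussians]

def l1(pred, gt):
    return sum(abs(p - g) for p, g in zip(pred, gt)) / len(pred)

def joint_loss(events, gt_depth, gt_image, w=(1.0, 1.0, 1.0)):
    # Cascade: each submodule consumes the previous one's output, so a
    # single loss lets the rendering term also supervise depth/intensity.
    depth = depth_module(events)
    intensity = intensity_module(events, depth)
    image = render(gaussian_module(depth, intensity))
    return (w[0] * l1(depth, gt_depth)
            + w[1] * l1(intensity, gt_image)
            + w[2] * l1(image, gt_image))

events = [0.5, -0.25, 1.0]
gt_depth = [0.2, 0.25, 1.0]   # depth term contributes 0.1
gt_image = [1.0, 0.3, 2.0]    # intensity and render terms contribute 0.1 each
print(round(joint_loss(events, gt_depth, gt_image), 6))  # 0.3
```

The design point illustrated is that the three submodules share one objective rather than being trained in isolation, which is what allows the joint training reported in the abstract to outperform individually trained modules.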