Bayesian Diffusion Models for 3D Shape Reconstruction
Abstract: We present Bayesian Diffusion Models (BDM), a prediction algorithm that performs effective Bayesian inference by tightly coupling top-down (prior) information with the bottom-up (data-driven) procedure via joint diffusion processes. We show the effectiveness of BDM on the 3D shape reconstruction task. Compared to typical data-driven deep learning approaches trained on paired (supervised) data-label datasets (e.g., image-point cloud pairs), BDM brings in rich prior information from standalone labels (e.g., point clouds) to improve bottom-up 3D reconstruction. In contrast to standard Bayesian frameworks, where an explicit prior and likelihood are required for inference, BDM performs seamless information fusion via coupled diffusion processes with learned gradient computation networks. The distinguishing feature of BDM is its ability to support active and effective information exchange and fusion between the top-down and bottom-up processes, each of which is itself a diffusion process. We demonstrate state-of-the-art results on both synthetic and real-world benchmarks for 3D shape reconstruction.
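To make the idea of two coupled diffusion processes more concrete, the following is a minimal toy sketch in Python with NumPy. It is an illustration only and not the paper's implementation. The functions `prior_denoiser` and `data_denoiser` are hypothetical stand-ins for learned networks, and the update rule, noise schedule, `fuse_every`, and `fuse_weight` are all assumptions chosen for simplicity. The fusion here is a one-directional linear blend of intermediate states, which may differ from the fusion scheme BDM actually uses.

```python
import numpy as np


def prior_denoiser(points):
    """Hypothetical stand-in for a learned shape prior: snap points onto the unit sphere."""
    norms = np.linalg.norm(points, axis=1, keepdims=True)
    return points / np.maximum(norms, 1e-8)


def data_denoiser(points, observation):
    """Hypothetical stand-in for an image-conditioned network: move halfway to the observation."""
    return 0.5 * (points + observation)


def coupled_reverse_diffusion(observation, num_steps=50, fuse_every=5,
                              fuse_weight=0.5, seed=0):
    """Run a prior chain and a data-driven chain side by side, blending them periodically."""
    rng = np.random.default_rng(seed)
    x_prior = rng.standard_normal(observation.shape)
    x_data = x_prior.copy()  # both chains start from the same noise

    for t in range(num_steps, 0, -1):
        step = 1.0 / t                     # grows to 1 at the final step
        sigma = 0.1 * (t - 1) / num_steps  # toy noise level, zero at the final step

        # One toy denoising update per chain.
        x_prior = x_prior + step * (prior_denoiser(x_prior) - x_prior)
        x_prior = x_prior + sigma * rng.standard_normal(x_prior.shape)
        x_data = x_data + step * (data_denoiser(x_data, observation) - x_data)
        x_data = x_data + sigma * rng.standard_normal(x_data.shape)

        # Assumed fusion rule: pull the data-driven state toward the prior state.
        if t % fuse_every == 0:
            x_data = (1.0 - fuse_weight) * x_data + fuse_weight * x_prior

    return x_data


# Toy "observation": a noisy sphere standing in for an image-derived estimate.
rng = np.random.default_rng(1)
clean = rng.standard_normal((256, 3))
clean /= np.linalg.norm(clean, axis=1, keepdims=True)
observation = clean + 0.2 * rng.standard_normal(clean.shape)

fused = coupled_reverse_diffusion(observation, fuse_weight=0.5)
data_only = coupled_reverse_diffusion(observation, fuse_weight=0.0)
```

Setting `fuse_weight=0.0` recovers a purely data-driven chain, which gives a baseline to compare against the fused run. Whether fusion helps in this toy depends on how well the assumed prior (a unit sphere) matches the target shape.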