
ControlMat: A Controlled Generative Approach to Material Capture

Published 4 Sep 2023 in cs.CV and cs.GR | (2309.01700v3)

Abstract: Material reconstruction from a photograph is a key component of democratizing 3D content creation. We propose to formulate this ill-posed problem as a controlled synthesis one, leveraging the recent progress in generative deep networks. We present ControlMat, a method that, given a single photograph with uncontrolled illumination as input, conditions a diffusion model to generate plausible, tileable, high-resolution physically-based digital materials. We carefully analyze the behavior of diffusion models for multi-channel outputs, adapt the sampling process to fuse multi-scale information, and introduce rolled diffusion to enable both tileability and patched diffusion for high-resolution outputs. Our generative approach further permits exploration of a variety of materials that could correspond to the input image, mitigating the unknown lighting conditions. We show that our approach outperforms recent inference and latent-space-optimization methods, and carefully validate our diffusion process design choices. Supplemental materials and additional details are available at: https://gvecchio.com/controlmat/.
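
The abstract names rolled diffusion but does not spell out how it works. The following is a minimal sketch of one plausible reading, not the authors' implementation: before each denoising step the image is circularly shifted by a random offset, the step is applied, and the shift is undone, so that the image borders (and any patch borders) land in a different place at every step. The names `roll2d`, `rolled_step`, and `toy_denoise` are hypothetical, and `toy_denoise` is a placeholder for a real diffusion network.

```python
import random


def roll2d(grid, dy, dx):
    """Circularly shift a 2D list by dy rows and dx columns (wrap-around)."""
    h, w = len(grid), len(grid[0])
    return [[grid[(y - dy) % h][(x - dx) % w] for x in range(w)]
            for y in range(h)]


def rolled_step(grid, denoise_fn, rng):
    """Apply one step of denoise_fn to a randomly rolled copy, then roll back.

    Assumption: this is an illustrative reading of "rolled diffusion",
    not the procedure described in the paper itself.
    """
    h, w = len(grid), len(grid[0])
    dy, dx = rng.randrange(h), rng.randrange(w)
    shifted = roll2d(grid, dy, dx)        # move the seams to a random position
    processed = denoise_fn(shifted)       # placeholder for the network call
    return roll2d(processed, -dy, -dx)    # restore the original alignment


def toy_denoise(grid):
    """Placeholder denoiser: scales every value by 0.5."""
    return [[0.5 * v for v in row] for row in grid]


if __name__ == "__main__":
    image = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
    print(rolled_step(image, toy_denoise, random.Random(0)))
```

Because the placeholder denoiser acts on each value independently, the roll has no visible effect here; with a convolutional network the shift would change which pixels sit next to the border at each step.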
