
BooW-VTON: Boosting In-the-Wild Virtual Try-On via Mask-Free Pseudo Data Training

Published 12 Aug 2024 in cs.CV | (2408.06047v2)

Abstract: Image-based virtual try-on is an increasingly popular and important task that aims to generate realistic try-on images of a specific person. Recent methods model virtual try-on as a mask-based inpainting task, which requires masking the person image and results in a significant loss of spatial information. In in-the-wild try-on scenarios with complex poses and occlusions in particular, mask-based methods often introduce noticeable artifacts. We find that a mask-free approach can fully leverage the spatial and lighting information in the original person image, enabling high-quality virtual try-on. Consequently, we propose a novel training paradigm for a mask-free try-on diffusion model. We ensure the model's mask-free try-on capability by creating high-quality pseudo-data, and we further enhance its handling of complex spatial information through effective in-the-wild data augmentation. In addition, we design a try-on localization loss that concentrates on the try-on area while suppressing garment features in non-try-on areas, ensuring precise rendering of garments and preservation of the foreground and background. Finally, we introduce BooW-VTON, a mask-free virtual try-on diffusion model that delivers state-of-the-art (SOTA) try-on quality without parsing cost. Extensive qualitative and quantitative experiments demonstrate superior performance in in-the-wild scenarios with such low-demand input.

