Fréchet Wavelet Distance: A Domain-Agnostic Metric for Image Generation
Abstract: Modern metrics for generative learning, such as the Fréchet Inception Distance (FID) and the DINOv2-Fréchet Distance (FD-DINOv2), demonstrate impressive performance. However, they suffer from various shortcomings, such as a bias towards specific generators and datasets. To address this problem, we propose the Fréchet Wavelet Distance (FWD), a domain-agnostic metric based on the Wavelet Packet Transform ($W_p$). FWD provides a view across a broad spectrum of frequencies in images at high resolution, preserving both spatial and textural aspects. Specifically, we use $W_p$ to project generated and real images into the packet coefficient space. We then compute the Fréchet distance between the resulting coefficients to evaluate the quality of a generator. The metric is general-purpose and dataset-domain agnostic because it does not rely on any pre-trained network, and it is more interpretable because the Fréchet distance can be computed per packet, which enhances transparency. An extensive evaluation of a wide variety of generators across various datasets shows that the proposed FWD generalizes and is more robust to domain shifts and various corruptions than other metrics.
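The two steps described in the abstract, projecting images into wavelet packet coefficients and then computing a Fréchet distance on those coefficients, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The Haar wavelet, the grayscale input, the averaging of per-packet distances into a single score, and the function names `haar_packets`, `frechet_distance`, and `wavelet_frechet_distance` are all assumptions made for illustration.

```python
import numpy as np


def haar_packets(images, levels):
    """Full 2D Haar wavelet packet decomposition (assumed wavelet choice).

    images: array of shape (N, H, W); H and W must be divisible by 2**levels.
    Returns an array of shape (N, 4**levels, H / 2**levels, W / 2**levels).
    """
    packets = np.asarray(images, dtype=np.float64)[:, None, :, :]
    for _ in range(levels):
        # The four pixels of every non-overlapping 2x2 block.
        a = packets[:, :, 0::2, 0::2]
        b = packets[:, :, 0::2, 1::2]
        c = packets[:, :, 1::2, 0::2]
        d = packets[:, :, 1::2, 1::2]
        # Unlike the plain wavelet transform, every sub-band is split again.
        packets = np.concatenate(
            [
                (a + b + c + d) / 2,  # low-pass in both directions
                (a - b + c - d) / 2,  # high-pass along the width
                (a + b - c - d) / 2,  # high-pass along the height
                (a - b - c + d) / 2,  # high-pass in both directions
            ],
            axis=1,
        )
    return packets


def frechet_distance(x, y):
    """Fréchet distance between Gaussians fitted to the rows of x and y."""
    mu_x, mu_y = x.mean(axis=0), y.mean(axis=0)
    cov_x = np.atleast_2d(np.cov(x, rowvar=False))
    cov_y = np.atleast_2d(np.cov(y, rowvar=False))
    # Tr((cov_x cov_y)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_x @ cov_y; tiny negative values are clipped.
    eigvals = np.linalg.eigvals(cov_x @ cov_y)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    mean_term = np.sum((mu_x - mu_y) ** 2)
    return float(mean_term + np.trace(cov_x) + np.trace(cov_y) - 2.0 * trace_sqrt)


def wavelet_frechet_distance(real, fake, levels=2):
    """Per-packet Fréchet distances and their mean (averaging is an assumption)."""
    real_p = haar_packets(real, levels)
    fake_p = haar_packets(fake, levels)
    per_packet = np.array(
        [
            frechet_distance(
                real_p[:, p].reshape(len(real_p), -1),
                fake_p[:, p].reshape(len(fake_p), -1),
            )
            for p in range(real_p.shape[1])
        ]
    )
    return float(per_packet.mean()), per_packet


# Toy usage: "generated" images are the real ones with a constant brightness shift.
rng = np.random.default_rng(0)
real = rng.random((200, 8, 8))
fake = real + 1.0
score, per_packet = wavelet_frechet_distance(real, fake, levels=2)
# The shift only changes the coarsest approximation packet (index 0).
print(score, int(per_packet.argmax()))
```

In this toy setup the per-packet breakdown localizes the discrepancy to a single frequency band, which is the kind of interpretability the abstract refers to. A practical setup would likely use a dedicated wavelet library, color images, and far more samples than features per packet.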