A Differential Equation Approach for Wasserstein GANs and Beyond
Abstract: This paper proposes a new theoretical lens through which to view Wasserstein generative adversarial networks (WGANs). To minimize the Wasserstein-1 distance between the true data distribution and our estimate of it, we derive a distribution-dependent ordinary differential equation (ODE) that represents the gradient flow of the Wasserstein-1 loss, and we show that a forward Euler discretization of this ODE converges. This inspires a new class of generative models, which we call W1-FE, that naturally integrates persistent training. When persistent training is turned off, we prove that W1-FE reduces to WGAN. When persistent training is intensified, W1-FE outperforms WGAN in experiments ranging from low to high dimensions, in terms of both convergence speed and final training results. Intriguingly, these benefits are realized only when persistent training is carefully integrated through our ODE perspective: as we demonstrate numerically, naively adding persistent training to WGAN (without the ODE framework) can significantly worsen training results.
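The abstract does not reproduce the ODE or its discretization, so the sketch below fills in the standard form these objects take under Kantorovich-Rubinstein duality; the notation (the data distribution ν, the potential φ_μ, the step size η, the pushforward) is our own labeling for illustration, not necessarily the paper's.

```latex
% A minimal sketch, assuming the standard Kantorovich-dual setup:
% \nu is the data distribution, \mu_t the current model distribution,
% and \varphi_{\mu_t} a Kantorovich potential (optimal 1-Lipschitz
% critic) for W_1(\mu_t, \nu).
\begin{align*}
  \varphi_{\mu} &\in \operatorname*{arg\,max}_{\mathrm{Lip}(\varphi)\le 1}
      \int \varphi \, d\mu - \int \varphi \, d\nu
      && \text{(Kantorovich duality for } W_1\text{)} \\
  dY_t &= -\nabla \varphi_{\mu_t}(Y_t)\, dt,
      \qquad \mu_t = \mathrm{Law}(Y_t)
      && \text{(particle form of the gradient-flow ODE)} \\
  \mu_{k+1} &= \bigl(\mathrm{id} - \eta\, \nabla \varphi_{\mu_k}\bigr)_{\#}\, \mu_k
      && \text{(forward Euler step, step size } \eta > 0\text{)}
\end{align*}
```

On this reading, the optimal φ plays the role of the WGAN critic and the Euler step moves each generated sample along -∇φ; persistent training then plausibly corresponds to fitting the generator to the moved samples more than once per critic update, which is consistent with the abstract's claim that W1-FE reduces to WGAN when persistent training is turned off.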