On the Stability of Iterative Retraining of Generative Models on their own Data
Abstract: Deep generative models have made tremendous progress in modeling complex data, often exhibiting generation quality that surpasses a typical human's ability to discern the authenticity of samples. Undeniably, a key driver of this success is the massive amount of web-scale data these models consume. Given their striking performance and wide availability, the web will inevitably be increasingly populated with synthetic content, which directly implies that future iterations of generative models will be trained on a mixture of clean data and artificially generated data from past models. In this paper, we develop a framework to rigorously study the impact of training generative models on mixed datasets, spanning the range from classical training on real data to self-consuming generative models trained on purely synthetic data. We first prove that iterative training is stable provided the initial generative model approximates the data distribution well enough and the proportion of clean training data (relative to synthetic data) is large enough. We empirically validate our theory on both synthetic and natural images by iteratively training normalizing flows and state-of-the-art diffusion models on CIFAR-10 and FFHQ.
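To make the self-consuming retraining loop concrete, below is a minimal toy sketch (our own illustration under stated assumptions, not the paper's code or its actual models): a one-dimensional Gaussian "generative model" is refit at each iteration on a dataset mixing a fraction `lam` of clean samples with samples drawn from the previous model. The Gaussian model family, the function names, and the parameter choices are all illustrative assumptions; the sketch only mirrors the mixed-dataset training scheme the abstract describes.

```python
import numpy as np

# Toy sketch of iterative retraining on mixed clean/synthetic data.
# lam = fraction of clean (real) data in each iteration's training set;
# 1 - lam is filled with samples drawn from the previous model.

rng = np.random.default_rng(0)
true_mu, true_sigma = 0.0, 1.0
clean = rng.normal(true_mu, true_sigma, size=10_000)  # fixed real dataset

def fit_gaussian(x):
    # Maximum-likelihood "training": estimate mean and std from samples.
    return x.mean(), x.std()

def iterate(lam, n_iters=50, n=10_000):
    mu, sigma = fit_gaussian(clean)  # initial model trained on real data
    for _ in range(n_iters):
        n_clean = int(lam * n)
        synthetic = rng.normal(mu, sigma, size=n - n_clean)  # sample own model
        mixed = np.concatenate([rng.choice(clean, n_clean), synthetic])
        mu, sigma = fit_gaussian(mixed)  # retrain on the mixed dataset
    return mu, sigma

for lam in [1.0, 0.5, 0.1, 0.0]:
    mu, sigma = iterate(lam)
    print(f"lam={lam:.1f}: mu={mu:+.3f}, sigma={sigma:.3f}")
```

Running this, `lam = 1.0` recovers the data statistics up to sampling noise, and moderate `lam` typically stays close to them, whereas the purely self-consuming setting `lam = 0.0` tends to drift as finite-sample estimation errors compound across iterations, consistent in spirit with the stability condition stated above.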