Neural Network Diffusion
Abstract: Diffusion models have achieved remarkable success in image and video generation. In this work, we demonstrate that diffusion models can also generate high-performing neural network parameters. Our approach is simple: it combines an autoencoder with a diffusion model. The autoencoder extracts latent representations of a subset of the trained neural network parameters. A diffusion model is then trained to synthesize these latent representations from random noise. Once trained, it generates new representations, which are passed through the autoencoder's decoder to produce new subsets of high-performing network parameters. Across various architectures and datasets, our approach consistently generates models with performance comparable to or better than the trained networks, at minimal additional cost. Notably, we find empirically that the generated models do not merely memorize the trained ones. Our results encourage further exploration of the versatile uses of diffusion models. Our code is available at https://github.com/NUS-HPC-AI-Lab/Neural-Network-Diffusion.
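The pipeline the abstract describes can be sketched at a high level: flatten a subset of trained parameters, encode it to a latent vector, run a DDPM-style reverse process over that latent space, and decode the result back into parameters. The sketch below is a minimal, untrained illustration in NumPy: the linear encoder/decoder, the dimensions, and the placeholder denoiser are all assumptions for illustration, not the authors' actual architecture or training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
param_dim, latent_dim, T = 2048, 128, 50  # illustrative sizes

# Toy linear "autoencoder" over a flattened parameter subset
# (stands in for the trained parameter autoencoder).
W_enc = rng.normal(0.0, 0.02, (latent_dim, param_dim))
W_dec = rng.normal(0.0, 0.02, (param_dim, latent_dim))
encode = lambda p: W_enc @ p
decode = lambda z: W_dec @ z

# Standard linear noise schedule for the diffusion process.
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bar = np.cumprod(alphas)

def q_sample(z0, t):
    """Forward process: noise a clean latent to timestep t."""
    eps = rng.normal(size=z0.shape)
    return np.sqrt(alpha_bar[t]) * z0 + np.sqrt(1.0 - alpha_bar[t]) * eps

def sample_latent(denoise):
    """Reverse process: ancestral sampling from pure noise."""
    z = rng.normal(size=latent_dim)
    for t in reversed(range(T)):
        eps_hat = denoise(z, t)  # a trained network in the real system
        z = (z - betas[t] / np.sqrt(1.0 - alpha_bar[t]) * eps_hat) / np.sqrt(alphas[t])
        if t > 0:
            z += np.sqrt(betas[t]) * rng.normal(size=latent_dim)
    return z

# Placeholder denoiser predicting zero noise, standing in for the
# trained latent diffusion model.
z_new = sample_latent(lambda z, t: np.zeros_like(z))
params_new = decode(z_new)  # a generated parameter subset
print(params_new.shape)
```

In the real method, `denoise` is a learned network trained on latents produced by `encode`, and `params_new` would be written back into the target network before evaluation; here it simply demonstrates the data flow from noise to parameters.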