Towards Effective Data-Free Knowledge Distillation via Diverse Diffusion Augmentation

Published 23 Oct 2024 in cs.CV and cs.AI | arXiv:2410.17606v1

Abstract: Data-free knowledge distillation (DFKD) has emerged as a pivotal technique in the domain of model compression, substantially reducing the dependency on the original training data. Nonetheless, conventional DFKD methods that employ synthesized training data are prone to the limitations of inadequate diversity and discrepancies in distribution between the synthesized and original datasets. To address these challenges, this paper introduces an innovative approach to DFKD through diverse diffusion augmentation (DDA). Specifically, we revise the common data-synthesis paradigm in DFKD into a composite process, leveraging diffusion models after data synthesis for self-supervised augmentation, which generates a spectrum of data samples with similar distributions while retaining controlled variations. Furthermore, to mitigate excessive deviation in the embedding space, we introduce an image filtering technique grounded in cosine similarity to maintain fidelity during the knowledge distillation process. Comprehensive experiments conducted on CIFAR-10, CIFAR-100, and Tiny-ImageNet demonstrate the superior performance of our method across various teacher-student network configurations, outperforming contemporary state-of-the-art DFKD methods. Code will be available at: https://github.com/SLGSP/DDA.
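
As a rough illustration of the cosine-similarity filtering described in the abstract, the sketch below keeps a diffusion-augmented sample only when its teacher embedding stays close to that of the originally synthesized image, and falls back to the original sample otherwise. This is a minimal sketch under assumed interfaces: the `teacher.features(x)` embedding method and the `threshold` value are hypothetical placeholders, not the authors' actual implementation.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def filter_augmented_batch(teacher, synth_images, aug_images, threshold=0.8):
    """Filter diffusion-augmented images by cosine similarity in the
    teacher's embedding space (hypothetical interface, for illustration).

    synth_images, aug_images: tensors of shape (N, C, H, W), where
    aug_images[i] is the diffusion-augmented version of synth_images[i].
    """
    # Assumed: teacher.features(x) returns an (N, D) embedding per batch.
    feat_synth = F.normalize(teacher.features(synth_images), dim=1)
    feat_aug = F.normalize(teacher.features(aug_images), dim=1)

    # Per-sample cosine similarity between original and augmented embeddings.
    cos_sim = (feat_synth * feat_aug).sum(dim=1)   # shape: (N,)
    keep = cos_sim >= threshold                    # boolean mask, shape: (N,)

    # Replace augmentations that drift too far with their original images,
    # so the distillation batch stays close to the synthesized distribution.
    filtered = torch.where(keep[:, None, None, None], aug_images, synth_images)
    return filtered, keep
```

In a DDA-style training loop, such a filter would sit between the diffusion-augmentation step and the distillation loss, so the student only sees augmented samples whose semantics the teacher still recognizes as consistent with the synthesized originals.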

