
Buffer-based Gradient Projection for Continual Federated Learning

Published 3 Sep 2024 in cs.LG and cs.DC (arXiv:2409.01585v1)

Abstract: Continual Federated Learning (CFL) is essential for enabling real-world applications where multiple decentralized clients adaptively learn from continuous data streams. A significant challenge in CFL is mitigating catastrophic forgetting, where models lose previously acquired knowledge when learning new information. Existing approaches often face difficulties due to the constraints of device storage capacities and the heterogeneous nature of data distributions among clients. While some CFL algorithms have addressed these challenges, they frequently rely on unrealistic assumptions about the availability of task boundaries (i.e., knowing when new tasks begin). To address these limitations, we introduce Fed-A-GEM, a federated adaptation of the A-GEM method (Chaudhry et al., 2019), which employs a buffer-based gradient projection approach. Fed-A-GEM alleviates catastrophic forgetting by leveraging local buffer samples and aggregated buffer gradients, thus preserving knowledge across multiple clients. Our method can be combined with existing CFL techniques, enhancing their performance in the CFL context. Our experiments on standard benchmarks show consistent performance improvements across diverse scenarios. For example, in a task-incremental learning scenario using the CIFAR-100 dataset, our method can increase the accuracy by up to 27%. Our code is available at https://github.com/shenghongdai/Fed-A-GEM.
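
For context, below is a minimal sketch of the buffer-based gradient projection that Fed-A-GEM builds on. It follows the A-GEM rule (Chaudhry et al., 2019): compute a reference gradient on buffer samples and, whenever the current gradient conflicts with it (negative inner product), project out the conflicting component. The federated aggregation step and all function names here are illustrative assumptions for exposition, not the authors' implementation; see the linked repository for that.

```python
import numpy as np

def agem_project(grad, ref_grad):
    # A-GEM projection rule: if the proposed update conflicts with the
    # buffer (reference) gradient, i.e. their inner product is negative,
    # remove the conflicting component so the buffer loss does not
    # increase to first order; otherwise leave the gradient unchanged.
    dot = float(np.dot(grad, ref_grad))
    if dot < 0:
        grad = grad - (dot / float(np.dot(ref_grad, ref_grad))) * ref_grad
    return grad

def aggregate_buffer_gradients(client_buffer_grads):
    # Hypothetical server-side step: average the buffer gradients
    # uploaded by clients into a single global reference direction.
    return np.mean(np.stack(client_buffer_grads), axis=0)

# Toy usage with flattened parameter vectors.
rng = np.random.default_rng(0)
g_ref = aggregate_buffer_gradients([rng.normal(size=10) for _ in range(4)])
g_local = rng.normal(size=10)
g_safe = agem_project(g_local, g_ref)
assert np.dot(g_safe, g_ref) >= -1e-9  # projected update no longer conflicts
```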

References (68)
  1. Expert Gate: Lifelong learning with a network of experts. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.  3366–3375, 2017.
  2. Memory aware synapses: Learning what (not) to forget. In Proceedings of the European Conference on Computer Vision, pp.  139–154, 2018.
  3. Gradient based sample selection for online continual learning. Advances in Neural Information Processing Systems, 32:11816–11825, 2019.
  4. A data-free approach to mitigate catastrophic forgetting in federated class incremental learning for vision tasks. Advances in Neural Information Processing Systems, 36, 2024.
  5. Federated orthogonal training: Mitigating global catastrophic forgetting in continual federated learning. In International Conference on Learning Representations, 2024.
  6. Practical secure aggregation for privacy-preserving machine learning. In proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pp.  1175–1191, 2017.
  7. Dark experience for general continual learning: a strong, simple baseline. Advances in Neural Information Processing Systems, 33:15920–15930, 2020.
  8. Co²L: Contrastive continual learning. In Proceedings of the IEEE International Conference on Computer Vision, pp. 9516–9525, 2021.
  9. Riemannian walk for incremental learning: Understanding forgetting and intransigence. In Proceedings of the European Conference on Computer Vision, pp.  532–547, 2018.
  10. Efficient lifelong learning with A-GEM. In International Conference on Learning Representations, 2019.
  11. Fedhealth: A federated transfer learning framework for wearable healthcare. IEEE Intelligent Systems, 35(4):83–93, 2020.
  12. Lifelong machine learning. Synthesis Lectures on Artificial Intelligence and Machine Learning, 12(3):1–207, 2018.
  13. Online federated learning based object detection across autonomous vehicles in a virtual world. In IEEE Consumer Communications & Networking Conference, pp.  919–920, 2023.
  14. Federated class-incremental learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.  10164–10173, 2022.
  15. CARLA: An open urban driving simulator. In Conference on Robot Learning, volume 78, pp. 1–16, 2017.
  16. An image is worth 16x16 words: Transformers for image recognition at scale. In International Conference on Learning Representations, 2020.
  17. Federated learning in vehicular networks. In 2022 IEEE International Mediterranean Conference on Communications and Networking (MeditCom), pp.  72–77. IEEE, 2022.
  18. Orthogonal gradient descent for continual learning. In International Conference on Artificial Intelligence and Statistics, pp.  3762–3773, 2020.
  19. PathNet: Evolution channels gradient descent in super neural networks. arXiv preprint arXiv:1701.08734, 2017.
  20. Robert M French. Catastrophic forgetting in connectionist networks. Trends in Cognitive Sciences, 3(4):128–135, 1999.
  21. Differentially private federated learning: A client level perspective. arXiv preprint arXiv:1712.07557, 2017.
  22. An empirical investigation of catastrophic forgetting in gradient-based neural networks. arXiv preprint arXiv:1312.6211, 2013.
  23. Measuring the effects of non-identical data distribution for federated visual classification. arXiv preprint arXiv:1909.06335, 2019.
  24. Advances and open problems in federated learning. Foundations and Trends® in Machine Learning, 14:1–210, 2021.
  25. SCAFFOLD: Stochastic controlled averaging for federated learning. In International Conference on Machine Learning, volume 119, pp.  5132–5143, 2020.
  26. Overcoming catastrophic forgetting in neural networks. Proceedings of the National Academy of Sciences, 114:3521–3526, 2017.
  27. Federated learning: Strategies for improving communication efficiency. arXiv preprint arXiv:1610.05492, 2016.
  28. Fair resource allocation in federated learning. In International Conference on Learning Representations, 2019.
  29. Federated optimization in heterogeneous networks. Proceedings of Machine Learning and Systems, 2:429–450, 2020.
  30. Learning without forgetting. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017.
  31. Federated learning in mobile edge networks: A comprehensive survey. IEEE Communications Surveys & Tutorials, 2020.
  32. Gradient episodic memory for continual learning. Advances in Neural Information Processing Systems, 30:6467–6476, 2017.
  33. Continual federated learning based on knowledge distillation. In Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, volume 3, pp.  2182–2188, 2022.
  34. Online continual learning in image classification: An empirical survey. Neurocomputing, 469:28–51, 2022.
  35. PackNet: Adding multiple tasks to a single network by iterative pruning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.  7765–7773, 2018.
  36. Catastrophic interference in connectionist networks: The sequential learning problem. In Psychology of Learning and Motivation, volume 24, pp. 109–165. Elsevier, 1989.
  37. Communication-efficient learning of deep networks from decentralized data. In Artificial Intelligence and Statistics, volume 54, pp. 1273–1282, 2017.
  38. An empirical investigation of the role of pre-training in lifelong learning. Journal of Machine Learning Research, 24(214):1–50, 2023.
  39. Agnostic federated learning. In International Conference on Machine Learning, volume 97, pp.  4615–4625, 2019.
  40. Continual lifelong learning with neural networks: A review. Neural Networks, 113:54–71, 2019.
  41. DualNet: Continual learning, fast and slow. Advances in Neural Information Processing Systems, 34:16131–16144, 2021.
  42. GDumb: A simple approach that questions our progress in continual learning. In Proceedings of the European Conference on Computer Vision, pp.  524–540, 2020.
  43. Roger Ratcliff. Connectionist models of recognition memory: Constraints imposed by learning and forgetting functions. Psychological Review, 97(2):285, 1990.
  44. iCaRL: Incremental classifier and representation learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.  2001–2010, 2017.
  45. YOLO9000: Better, faster, stronger. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.  7263–7271, 2017.
  46. OpenFL: An open-source framework for federated learning. arXiv preprint arXiv:2105.06413, 2021.
  47. Anthony Robins. Catastrophic forgetting, rehearsal and pseudorehearsal. Connection Science, 7(2):123–146, 1995.
  48. Progressive neural networks. arXiv preprint arXiv:1606.04671, 2016.
  49. Gradient projection memory for continual learning. In International Conference on Learning Representations, 2020.
  50. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108, 2019.
  51. Overcoming catastrophic forgetting with hard attention to the task. In International Conference on Machine Learning, volume 80, pp.  4548–4557, 2018.
  52. Applications of federated learning; taxonomy, challenges, and research trends. Electronics, 11(4):670, 2022.
  53. Continual learning with deep generative replay. Advances in Neural Information Processing Systems, 30:2990–2999, 2017.
  54. Incremental learning of object detectors without catastrophic forgetting. In Proceedings of the IEEE International Conference on Computer Vision, pp. 3400–3409, 2017.
  55. Overcoming forgetting in federated learning on non-IID data. arXiv preprint arXiv:1910.07796, 2019.
  56. Sebastian Thrun. Is learning the n-th thing any easier than learning the first? Advances in Neural Information Processing Systems, 8:640–646, 1995.
  57. Addressing client drift in federated continual learning with adaptive optimization. arXiv preprint arXiv:2203.13321, 2022.
  58. Jeffrey S Vitter. Random sampling with a reservoir. ACM Transactions on Mathematical Software, 11:37–57, 1985.
  59. Learning to prompt for continual learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.  139–149, 2022.
  60. Supermasks in superposition. Advances in Neural Information Processing Systems, 33:15173–15184, 2020.
  61. Large scale incremental learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.  374–382, 2019.
  62. Continual local training for better initialization of federated models. In IEEE International Conference on Image Processing, pp.  1736–1740, 2020.
  63. Lifelong learning with dynamically expandable networks. In International Conference on Learning Representations, 2018.
  64. Federated continual learning with weighted inter-client transfer. In International Conference on Machine Learning, volume 139, pp.  12073–12086, 2021.
  65. Continual learning through synaptic intelligence. In International Conference on Machine Learning, volume 70, pp.  3987–3995, 2017.
  66. Character-level convolutional networks for text classification. Advances in Neural Information Processing Systems, 28:649–657, 2015.
  67. Nested hierarchical transformer: Towards accurate, data-efficient and interpretable visual understanding. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 36, pp.  3417–3425, 2022.
  68. Federated learning with non-IID data. arXiv preprint arXiv:1806.00582, 2018.