Unmasking Efficiency: Learning Salient Sparse Models in Non-IID Federated Learning
Abstract: In this work, we propose Salient Sparse Federated Learning (SSFL), a streamlined approach to sparse federated learning with efficient communication. SSFL identifies a sparse subnetwork prior to training by computing parameter saliency scores separately on each client's local (non-IID) data and aggregating them into a single global mask. Only the sparse model weights are communicated between the clients and the server each round. We validate SSFL on standard non-IID benchmarks and observe marked improvements in the sparsity–accuracy trade-off. Finally, we deploy our method in a real-world federated learning framework and report reduced communication time.
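To make the pre-training mask construction concrete, below is a minimal, hypothetical PyTorch sketch. It assumes a SNIP-style saliency score |w ⊙ ∂L/∂w| computed on a local batch and simple averaging across clients before a global top-k threshold; the paper's exact saliency criterion and aggregation rule may differ, and all function names here are illustrative.

```python
import torch

def local_saliency(model, data, target, loss_fn):
    """Per-parameter saliency |w * dL/dw| on one client's local batch
    (a SNIP-style score; the paper's exact criterion may differ)."""
    model.zero_grad()
    loss = loss_fn(model(data), target)
    loss.backward()
    return {name: (p * p.grad).abs().detach()
            for name, p in model.named_parameters() if p.grad is not None}

def global_mask(client_scores, density):
    """Average the clients' saliency scores and keep the top `density`
    fraction of weights, yielding one binary mask shared by all clients."""
    names = client_scores[0].keys()
    avg = {n: torch.stack([s[n] for s in client_scores]).mean(dim=0)
           for n in names}
    flat = torch.cat([v.flatten() for v in avg.values()])
    k = max(1, int(density * flat.numel()))
    threshold = torch.topk(flat, k).values.min()
    return {n: (v >= threshold).float() for n, v in avg.items()}
```

Under this sketch, each client would then train only the weights where the mask is 1 and exchange just those entries with the server each round, which is what yields the communication savings reported above.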