LabObf: A Label Protection Scheme for Vertical Federated Learning Through Label Obfuscation

Published 27 May 2024 in cs.LG and cs.CR (arXiv:2405.17042v2)

Abstract: The split neural network, one of the most common architectures in vertical federated learning, is popular in industry for its privacy-preserving characteristics. In this architecture, the party holding the labels seeks cooperation from other parties because its own feature data are insufficient to train a performant model. Each participant trains a self-defined bottom model to learn hidden representations from its own feature data and uploads the resulting embedding vectors to the top model held by the label holder for the final prediction. This design lets participants train jointly without directly exchanging data. However, existing research shows that malicious participants may still infer label information from the uploaded embeddings, leading to privacy leakage. In this paper, we first propose an embedding extension attack that manipulates embeddings to undermine existing defense strategies, which rely on constraining the correlation between the embeddings uploaded by participants and the labels. We then propose a new label obfuscation defense strategy, called `LabObf', which randomly maps each original integer-valued label to multiple real-valued soft labels with intertwined values, significantly increasing the difficulty for attackers to infer the labels. We conduct experiments on four different types of datasets, and the results show that LabObf significantly reduces the attacker's success rate compared to unprotected models while maintaining desirable model accuracy.
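The core obfuscation idea described in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's actual algorithm: the table-building scheme, the uniform value range, and the nearest-neighbour recovery step are all assumptions made here for clarity. The key property it demonstrates is that each integer label is replaced by one of several random real-valued soft-label vectors whose values intertwine across classes, while the label holder can still map a soft vector back to the original class.

```python
import math
import random

def build_obfuscation_table(num_classes, maps_per_class, dim, rnd):
    # For each original integer label, pre-generate several random
    # real-valued soft-label vectors. Values for every class are drawn
    # from the same interval, so soft labels of different classes
    # intertwine and no single coordinate reveals the class.
    return {
        c: [[rnd.uniform(-1.0, 1.0) for _ in range(dim)]
            for _ in range(maps_per_class)]
        for c in range(num_classes)
    }

def obfuscate(labels, table, rnd):
    # Replace each integer label with one of its soft-label vectors,
    # chosen uniformly at random per training sample.
    return [rnd.choice(table[y]) for y in labels]

def recover(soft_vec, table):
    # The label holder maps a soft vector back to the original class by
    # nearest-neighbour search over all stored soft labels.
    return min(
        table,
        key=lambda c: min(math.dist(v, soft_vec) for v in table[c]),
    )
```

An attacker who observes only the soft labels sees vectors drawn from overlapping distributions, whereas the label holder, who keeps the table private, recovers every original label exactly (the obfuscated vectors match table entries at distance zero).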

