
Fairness-Aware Structured Pruning in Transformers

Published 24 Dec 2023 in cs.CL, cs.CY, and cs.LG | arXiv:2312.15398v1

Abstract: The increasing size of LLMs has introduced challenges in their training and inference. Removing model components is perceived as a solution to tackle large model sizes; however, existing pruning methods focus solely on performance, without considering an essential aspect of the responsible use of LLMs: model fairness. It is crucial to address the fairness of LLMs towards diverse groups, such as women, Black people, LGBTQ+ people, and Jewish communities, as these models are deployed and made available to a wide audience. In this work, we first investigate how attention heads impact fairness and performance in pre-trained transformer-based LLMs. We then propose a novel method to prune the attention heads that negatively impact fairness while retaining the heads critical for performance, i.e., language modeling capabilities. Our approach is practical in terms of time and resources, as it does not require fine-tuning the final pruned (and fairer) model. Our findings demonstrate reductions in gender bias of 19%, 19.5%, 39.5%, 34.7%, 23%, and 8% for DistilGPT-2, GPT-2, two sizes of GPT-Neo, GPT-J, and Llama 2, respectively, relative to the biased models, with only a slight decrease in performance.


Summary

  • The paper introduces FASP, a method that targets bias-inducing attention heads to reduce model biases without significant loss in language modeling performance.
  • It quantifies the impact of each attention head on bias metrics and demonstrates substantial gender bias reduction in models like GPT-2.
  • The research underscores the value of fairness in model optimization, paving the way for broader applications and further exploration of bias mitigation techniques.

Fairness-Aware Structured Pruning in Transformers

The paper "Fairness-Aware Structured Pruning in Transformers" critically examines the challenge of balancing fairness and performance in LLMs through structured pruning. The study addresses a key limitation in current pruning methodologies, which predominantly focus on maximizing efficiency without adequately considering the implications on model fairness. This oversight can lead to unfair representations and outputs, particularly impacting underrepresented communities such as women, Black people, and LGBTQ+ individuals.

Main Contributions

The authors present a novel pruning approach aimed at enhancing fairness while maintaining performance, focusing specifically on pre-trained transformer-based models. Key contributions of this work are:

  1. Investigation of Pruning Impact: The paper explores the consequences of existing attention head pruning strategies on biases present across different LLMs, revealing that these methods often fail to improve fairness.
  2. Quantification of Bias: An innovative method is proposed to quantify the bias impact of attention heads: evaluating the change in bias metrics before and after the removal of each attention head, thereby determining its contribution to overall model bias (a minimal sketch of this procedure appears after this list).
  3. Fairness-Aware Pruning: The study introduces a structured pruning method that deliberately targets heads contributing to bias, provided they are not critical for language modeling performance. This dual-objective methodology is called Fairness-Aware Structured Pruning (FASP).
  4. Empirical Evaluation: Experimental validation showcases FASP's effectiveness, achieving significant reductions in gender bias across multiple transformer models, including DistilGPT-2 and GPT-2, without substantial degradation in performance as measured by language modeling perplexity.
  5. Effect on Various Biases: The paper also explores how targeting gender bias affects other biases relating to religion, race, sexual orientation, and nationality. It observes a correlation between reductions in gender bias and other social biases, highlighting the interconnected nature of these biases within model architectures.
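
To make the head-scoring step (contribution 2 above) concrete, the sketch below shows one way such a measurement could be implemented with Hugging Face Transformers: each head is ablated in turn via the `head_mask` forward argument, and the shift in a bias metric is recorded. This is a minimal illustrative reconstruction, not the authors' released code; in particular, `bias_score` is a hypothetical callable standing in for whichever bias metric is used (e.g., a score over counterfactual prompt pairs).

```python
import torch

def head_bias_contributions(model, bias_score, device="cpu"):
    """Estimate each attention head's contribution to model bias.

    For every (layer, head) pair, zero out that single head through the
    `head_mask` forward argument and compare a bias metric against the
    full model. A positive delta means removing the head reduces bias,
    i.e., the head is bias-inducing.

    `bias_score(model, head_mask)` is assumed to run the model with the
    given mask (e.g., model(input_ids, head_mask=head_mask)) and return
    a scalar bias measurement; it is not defined in this paper's text.
    """
    n_layers = model.config.n_layer  # GPT-2-style config attribute names
    n_heads = model.config.n_head

    full_mask = torch.ones(n_layers, n_heads, device=device)
    baseline = bias_score(model, head_mask=full_mask)

    contributions = {}
    for layer in range(n_layers):
        for head in range(n_heads):
            mask = full_mask.clone()
            mask[layer, head] = 0.0  # ablate exactly one head
            ablated = bias_score(model, head_mask=mask)
            contributions[(layer, head)] = baseline - ablated
    return contributions
```

An analogous loop over a perplexity metric would yield each head's performance criticality, giving the two per-head rankings that FASP's selection rule needs.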

Implications and Future Directions

The implications of this research are substantial for the responsible deployment of LLMs in sensitive applications, suggesting that bias mitigation can be effectively integrated into model optimization processes. The methodology introduces a practical strategy to enhance model fairness without resorting to costly retraining processes.
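
As a hedged illustration of that integration, the sketch below combines the two per-head rankings into a selection rule and removes the chosen heads with the standard Transformers `prune_heads` API, with no subsequent fine-tuning. The protection fraction and input dictionaries are assumptions for illustration, not the paper's exact hyperparameters or selection rule.

```python
def fairness_aware_prune(model, bias_contrib, ppl_delta, protect_frac=0.1):
    """Prune bias-inducing heads while sparing performance-critical ones.

    bias_contrib : {(layer, head): bias reduction if the head is removed}
    ppl_delta    : {(layer, head): perplexity increase if the head is removed}
    protect_frac : fraction of heads, ranked by perplexity impact, shielded
                   from pruning regardless of their bias contribution.

    Inputs and the protection fraction are illustrative assumptions.
    """
    heads = list(bias_contrib)

    # Protect the heads whose removal hurts language modeling the most.
    by_ppl = sorted(heads, key=lambda h: ppl_delta[h], reverse=True)
    protected = set(by_ppl[: max(1, int(protect_frac * len(heads)))])

    # Prune every unprotected head whose removal reduces bias.
    to_prune = {}
    for (layer, head), delta in bias_contrib.items():
        if delta > 0 and (layer, head) not in protected:
            to_prune.setdefault(layer, []).append(head)

    model.prune_heads(to_prune)  # structural removal; weights are deleted
    return model
```

Because `prune_heads` physically removes the affected projection weights, the resulting model is also slightly smaller and faster, which is consistent with the efficiency motivation behind pruning in the first place.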

Looking forward, the work opens avenues for further research into:

  • Generalization of the proposed pruning approach across various architectures and domains.
  • Development of comprehensive fairness metrics encompassing a broader range of biases.
  • Exploration of real-world applications to validate the efficacy of the approach in practice.

Moreover, the positive correlation observed between different social biases implies an opportunity to develop unified strategies for bias mitigation, extending beyond the scope of gender bias alone. Additionally, the structured pruning approach could inform future efforts in developing fair AI models that are both efficient and equitable, thus contributing to fairer AI systems on a broader scale.
