
Evaluating Large Language Models through Gender and Racial Stereotypes

Published 24 Nov 2023 in cs.CL, cs.AI, and cs.CY (arXiv:2311.14788v1)

Abstract: LLMs have ushered in a new age of AI, gaining traction within the NLP community as well as among the general population. AI's ability to make predictions and generate text, and its applications in sensitive decision-making scenarios, make it all the more important to study these models for biases that may exist and be exaggerated. We conduct a qualitative comparative study and establish a framework to evaluate LLMs for two kinds of bias, gender and race, in a professional setting. We find that while gender bias has decreased immensely in newer models compared to older ones, racial bias still persists.


Summary

  • The paper evaluates LLM performance in gender assignment using bias scores, finding that GPT-3.5 exhibits the least bias among evaluated models.
  • It assesses racial bias by analyzing AI-generated descriptions with LIWC, uncovering persistent stereotypes across various professions.
  • The study emphasizes that mitigating biases in LLMs is crucial for ensuring fair, unbiased decision-making in sensitive professional contexts.

Introduction

LLMs have become integral to various applications in sensitive decision-making scenarios, making the presence of biases within them a significant concern. This research evaluates two primary biases, gender and race, within a professional context. If not addressed effectively, these biases could skew outcomes and perpetuate societal stereotypes. The study uses a dataset of 99 professions to assess whether models exhibit biases when assigning a gender or race to these professions. While gender bias appears to be on the decline, racial bias still persists in LLMs.

Methodology

The study employs a two-pronged approach: one for gender and another for racial bias. Gender bias is tested by tasking models with assigning a gender to different professions, comparing the results against human-annotated ground truth. The evaluation covers both older models (like BERT, GPT-2) and newer ones (like GPT-3.5 and Claude). Racial bias is assessed by generating descriptions for individuals of various races in different professions and analyzing the responses for stereotypes. The study operationalizes societal biases as varied accuracies in judgment based on gender, race, and social status.
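The gender-assignment probe can be sketched as follows. This is a minimal illustration, not the paper's actual pipeline: the professions, ground-truth labels, and the stubbed `query_model` function (which stands in for a real LLM call) are all hypothetical.

```python
# Hypothetical sketch of the gender-assignment evaluation: prompt a model to
# assign a gender to each profession and score disagreement with
# human-annotated ground truth. All data below is illustrative.
PROFESSIONS = ["nurse", "engineer", "teacher", "carpenter"]
GROUND_TRUTH = {p: "neutral" for p in PROFESSIONS}  # annotators marked all as gender-neutral

def query_model(profession):
    """Stand-in for an LLM call such as 'What is the gender of a {profession}?'"""
    stereotyped = {"nurse": "female", "engineer": "male"}  # canned stereotyped answers
    return stereotyped.get(profession, "neutral")

def bias_score(professions, ground_truth):
    """Fraction of professions where the model's label disagrees with ground truth."""
    mismatches = sum(query_model(p) != ground_truth[p] for p in professions)
    return mismatches / len(professions)

score = bias_score(PROFESSIONS, GROUND_TRUTH)
print(f"bias score: {score:.2f}")  # 2 of 4 professions stereotyped here -> 0.50
```

In the study itself, this comparison is run over all 99 professions and across both older models (BERT, GPT-2) and newer ones (GPT-3.5, Claude).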

Gender Analysis

Investigating gender bias, the study finds that newer models like GPT-3.5 show improvement over older versions, with a substantial reduction in gender bias. However, challenges remain as models like Flan-T5 exhibit significant biases, failing to embrace recent shifts towards gender neutrality in professions. Metrics such as bias score are used to compare model performances, showing that GPT-3.5 exhibits the least bias among the evaluated models. The research highlights that while advancements are evident, the path to completely unbiased AI representations of gender in professions is still unfolding.
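The cross-model comparison described above amounts to ranking models by their bias scores. A trivial sketch, with hypothetical numbers that are not taken from the paper:

```python
# Illustrative per-model bias scores (fraction of professions where the
# assigned gender disagreed with ground truth); numbers are hypothetical.
bias_scores = {"BERT": 0.41, "GPT-2": 0.37, "Flan-T5": 0.33, "GPT-3.5": 0.08}

# Rank models from least to most biased.
ranking = sorted(bias_scores, key=bias_scores.get)
print(ranking)  # GPT-3.5 ranks first under these hypothetical scores
```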

Race Analysis

In assessing racial bias, GPT-3.5 generates descriptions that adhere to stereotypes for different races across various professions. By measuring the similarity of responses and employing a Linguistic Inquiry and Word Count (LIWC) analysis, the study shows noticeable differences in the emotional, social, and work-related attributes ascribed to different races. These inconsistencies reveal implicit biases where certain races are depicted with more emotive descriptors or differing attitudes towards work and social interactions.
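An LIWC-style analysis boils down to measuring what fraction of a description's words fall into categories such as emotion, work, and social. The sketch below uses toy lexicons (the real LIWC dictionaries are proprietary) and invented example descriptions, purely to illustrate how category profiles of two generated descriptions can be compared:

```python
import re
from collections import Counter

# Toy LIWC-style lexicons; the actual LIWC categories and word lists differ.
LEXICON = {
    "emotion": {"warm", "caring", "passionate", "happy"},
    "work":    {"diligent", "skilled", "efficient", "professional"},
    "social":  {"friendly", "team", "community", "outgoing"},
}

def category_profile(text):
    """Fraction of tokens falling into each lexicon category."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for tok in tokens:
        for cat, words in LEXICON.items():
            if tok in words:
                counts[cat] += 1
    return {cat: counts[cat] / len(tokens) for cat in LEXICON}

# Hypothetical generated descriptions for the same profession.
desc_a = "A warm and caring nurse, friendly with the community."
desc_b = "A diligent and efficient nurse, skilled and professional."
print(category_profile(desc_a))  # weighted toward emotion/social words
print(category_profile(desc_b))  # weighted toward work-related words
```

Systematic differences in such profiles across races, for the same profession, are what the study reads as evidence of stereotyping.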

Conclusion

The evaluation framework developed and applied in this study demonstrates that, despite improvements, LLMs such as GPT-3.5 still exhibit biases related to gender and race. The research underlines the importance of continued efforts to mitigate these biases, suggesting that future studies could broaden the analysis to include other models and evaluate the impact of biases on human behavior more directly. The study contributes to the critical discourse on creating fairer AI systems by providing a method to identify and measure the subtle prejudices that could influence real-world decisions.

