LLM-Assisted Content Conditional Debiasing for Fair Text Embedding
Abstract: Mitigating biases in machine learning models has become an increasing concern in NLP, particularly in developing fair text embeddings, which are crucial yet challenging for real-world applications like search engines. In response, this paper proposes a novel method for learning fair text embeddings. First, we define a novel content-conditional equal distance (CCED) fairness for text embeddings, ensuring content-conditional independence between sensitive attributes and text embeddings. Building on CCED, we introduce a content-conditional debiasing (CCD) loss to ensure that embeddings of texts with different sensitive attributes but identical content maintain the same distance from the embedding of their corresponding neutral text. Additionally, we tackle the issue of insufficient training data by using LLMs with instructions to fairly augment texts into different sensitive groups. Our extensive evaluations show that our approach effectively enhances fairness while maintaining the utility of embeddings. Furthermore, our augmented dataset, combined with the CCED metric, serves as a new benchmark for evaluating fairness.
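The equal-distance idea behind the CCD loss can be illustrated with a small sketch. This is not the paper's implementation: the distance metric, the function name `ccd_loss`, and the use of a mean-deviation penalty are all illustrative assumptions; the paper's actual loss may differ in form.

```python
import numpy as np

def ccd_loss(group_embs, neutral_emb):
    """Hypothetical sketch of a content-conditional debiasing (CCD) loss.

    Penalizes discrepancies between the distances from each sensitive-group
    embedding (e.g. male/female variants of the same content) to the
    embedding of the corresponding neutral text.
    """
    dists = np.array([np.linalg.norm(e - neutral_emb) for e in group_embs])
    # Squared deviation of each group's distance from the mean distance:
    # zero exactly when all groups are equidistant from the neutral text.
    return float(np.mean((dists - dists.mean()) ** 2))

# Toy 2-D example: two group embeddings equidistant from the neutral one.
neutral = np.array([0.0, 0.0])
balanced = [np.array([1.0, 0.0]), np.array([-1.0, 0.0])]
skewed = [np.array([1.0, 0.0]), np.array([-2.0, 0.0])]

print(ccd_loss(balanced, neutral))  # 0.0 — equidistant, no penalty
print(ccd_loss(skewed, neutral))    # 0.25 — unequal distances are penalized
```

Minimizing such a penalty during training would push embeddings of sensitive-attribute variants of the same content toward equal distance from their neutral counterpart, which is the CCED property the paper targets.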