Forcing Diffuse Distributions out of Language Models
Abstract: Despite being trained specifically to follow user instructions, today's instruction-tuned LLMs perform poorly when instructed to produce random outputs. For example, when prompted to pick a number uniformly between one and ten, Llama-2-13B-chat disproportionately favors the number five, and when tasked with picking a first name at random, Mistral-7B-Instruct chooses Avery 40 times more often than we would expect based on the U.S. population. When these LLMs are used for real-world tasks where diversity of outputs is crucial, such as LLM-assisted dataset construction, their inability to produce diffuse distributions over valid choices is a major hurdle. In this work, we propose a fine-tuning method that encourages LLMs to output distributions that are diffuse over valid outcomes. The methods we introduce generalize across a variety of tasks and distributions and make LLMs practical for synthetic dataset generation with little human intervention.
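The peakedness the abstract describes can be quantified by comparing a model's empirical output distribution to the uniform distribution over valid outcomes. The sketch below is not from the paper; it is a hypothetical illustration using KL divergence from uniform, with made-up sample counts standing in for model outputs.

```python
from collections import Counter
import math

def kl_from_uniform(samples, support):
    """KL divergence (in nats) of the empirical distribution of `samples`
    from the uniform distribution over `support`. Zero means perfectly
    diffuse; larger values mean more peaked."""
    counts = Counter(samples)
    n = len(samples)
    q = 1.0 / len(support)  # uniform probability of each valid outcome
    kl = 0.0
    for outcome in support:
        p = counts.get(outcome, 0) / n
        if p > 0:  # terms with p == 0 contribute nothing
            kl += p * math.log(p / q)
    return kl

# Hypothetical outputs for "pick a number between one and ten":
support = [str(i) for i in range(1, 11)]
peaked = ["5"] * 60 + support * 4   # 100 samples, heavily favoring "5"
diffuse = support * 10              # 100 samples, exactly uniform

print(kl_from_uniform(peaked, support))   # large: distribution is peaked
print(kl_from_uniform(diffuse, support))  # 0.0: exactly uniform
```

In practice one would fill `samples` with repeated model completions to the same prompt at a fixed temperature; the same divergence can be taken against a non-uniform target (e.g., name frequencies in the U.S. population) by replacing `q` with per-outcome target probabilities.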