Is Temperature the Creativity Parameter of Large Language Models?
Abstract: LLMs are applied to all sorts of creative tasks, and their outputs vary from beautiful, to peculiar, to pastiche, into plain plagiarism. The temperature parameter of an LLM regulates the amount of randomness in its output, leading to more diverse results at higher values; it is therefore often claimed to be the creativity parameter. Here, we investigate this claim using a narrative generation task with a fixed context, model, and prompt. Specifically, we present an empirical analysis of the LLM output for different temperature values using four necessary conditions for creativity in narrative generation: novelty, typicality, cohesion, and coherence. We find that temperature is weakly correlated with novelty and, unsurprisingly, moderately correlated with incoherence, but that there is no relationship with either cohesion or typicality. Overall, the results suggest that the LLM generates slightly more novel outputs as temperature increases, but the influence of temperature on creativity is far more nuanced and weak than the "creativity parameter" claim suggests. Finally, we discuss ideas for more controlled LLM creativity, rather than relying on chance via changes to the temperature parameter.
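To make the mechanism under study concrete: temperature rescales a model's logits before the softmax, flattening the token distribution at high values and sharpening it toward greedy decoding at low values. The following is a minimal self-contained sketch of this standard scheme (the function name and toy logits are illustrative, not taken from the paper):

```python
import math
import random

def sample_with_temperature(logits, temperature):
    """Sample a token index from logits after temperature scaling.

    Dividing logits by the temperature before the softmax flattens
    the distribution (T > 1) or sharpens it (T < 1); as T -> 0 the
    sampler approaches greedy (argmax) decoding.
    """
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return random.choices(range(len(logits)), weights=probs, k=1)[0]

# Near-zero temperature: effectively always picks the argmax token.
random.seed(0)
print(sample_with_temperature([1.0, 5.0, 2.0], 0.01))
```

At very high temperature the same logits yield a near-uniform distribution, which is why raising temperature increases output diversity (and, as the paper measures, incoherence) rather than creativity per se.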