Prompt Exploration with Prompt Regression

Published 17 May 2024 in cs.CL and cs.LG (arXiv:2405.11083v2)

Abstract: With the advent of democratized usage of LLMs, there is a growing desire to systematize LLM prompt creation and selection processes beyond iterative trial and error. Prior work largely focuses on searching the space of prompts without accounting for relations between prompt variations. Here we propose a framework, Prompt Exploration with Prompt Regression (PEPR), to predict the effect of prompt combinations given results for individual prompt elements, as well as a simple method to select an effective prompt for a given use case. We evaluate our approach with open-source LLMs of different sizes on several different tasks.
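The core idea, predicting the effect of prompt combinations from results for individual prompt elements, can be loosely sketched as a regression over element-indicator features. The sketch below is an illustrative assumption of additive per-element effects, not the paper's exact formulation: each prompt is encoded as a binary vector over candidate elements, weights are fit to observed evaluation scores, and the fitted weights predict scores for unseen combinations. All element names and score values here are made up.

```python
import numpy as np

# Hypothetical illustration of the "prompt regression" idea:
# represent each prompt as a binary vector over k candidate prompt
# elements, fit per-element weights from observed scores, then
# predict the score of an unseen element combination.

k = 4  # number of candidate prompt elements (assumed)

# Observed prompts: each row marks which elements the prompt included.
X = np.array([
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
])
# Scores from evaluating an LLM with each prompt (fabricated numbers).
y = np.array([0.60, 0.55, 0.40, 0.70, 0.65])

# Least-squares fit of additive per-element contributions.
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# Predict the score of an unseen combination, e.g. elements 0, 1, and 3.
combo = np.array([1, 1, 0, 1])
predicted = float(combo @ w)
print(round(predicted, 3))
```

Under this additive assumption, prompt selection reduces to picking the element combination with the highest predicted score; the paper's actual model and selection procedure may differ.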

