Baize: An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data
Abstract: Chat models, such as ChatGPT, have shown impressive capabilities and have been rapidly adopted across numerous domains. However, these models are typically accessible only through restricted APIs, creating barriers for new research and progress in the field. We propose a pipeline that automatically generates a high-quality multi-turn chat corpus by leveraging ChatGPT to engage in a conversation with itself. We then employ parameter-efficient tuning (LoRA) to enhance LLaMA, an open-source LLM. The resulting model, named Baize, demonstrates good performance in multi-turn dialogues, with guardrails that minimize potential risks. Furthermore, we propose a new technique, Self-Distill with Feedback, to further improve the performance of the Baize models using feedback from ChatGPT. The Baize models and data are released for research purposes only at https://github.com/project-baize/baize-chatbot. An online demo is also available at https://huggingface.co/spaces/project-baize/chat-with-baize.
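To make the self-chat step concrete, here is a minimal sketch of how such a corpus can be collected: a seed question (the paper samples these from sources such as Quora and Stack Overflow) is embedded in a template that asks ChatGPT to write both sides of a multi-turn dialogue in a single completion. The template wording, function names, and sampling settings below are illustrative assumptions, not the paper's exact prompt; the sketch assumes the legacy `openai` Python client (pre-1.0) and an `OPENAI_API_KEY` in the environment.

```python
# Sketch of self-chat data collection in the spirit of the pipeline
# described above. Prompt wording and names are illustrative, not the
# paper's exact template.
import openai

SELF_CHAT_TEMPLATE = (
    "The following is a conversation between a human and an AI assistant. "
    "They take turns chatting about the topic: '{seed}'. "
    "Human utterances start with [Human] and AI utterances start with [AI]. "
    "The AI is helpful, detailed, and polite. The conversation ends when "
    "either side has nothing more to say."
)

def self_chat(seed: str) -> str:
    """Ask ChatGPT to play both sides of a multi-turn dialogue seeded by
    a question, returning the raw transcript."""
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": SELF_CHAT_TEMPLATE.format(seed=seed)}],
        temperature=1.0,  # favor diverse dialogues over deterministic output
    )
    return response["choices"][0]["message"]["content"]

if __name__ == "__main__":
    transcript = self_chat("How do I fix a merge conflict in git?")
    print(transcript)  # raw [Human]/[AI] transcript
```

The raw transcript is then parsed into alternating human/AI turns, giving multi-turn training examples for the tuning stage.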
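The parameter-efficient tuning step can be sketched with Hugging Face's `peft` library, which implements LoRA: the LLaMA weights stay frozen while small low-rank adapter matrices injected into the attention projections are trained on the self-chat corpus. The checkpoint path and hyperparameters below are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch of LoRA tuning on LLaMA with transformers + peft; the
# checkpoint path and hyperparameters are illustrative.
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer
from peft import LoraConfig, get_peft_model

BASE = "path/to/llama-7b"  # placeholder: any local LLaMA checkpoint
tokenizer = LlamaTokenizer.from_pretrained(BASE)
model = LlamaForCausalLM.from_pretrained(BASE, torch_dtype=torch.float16)

# Freeze the base model and inject trainable low-rank adapters into
# the attention query/value projections.
config = LoraConfig(
    r=8,                                  # rank of the update matrices
    lora_alpha=16,                        # scaling factor for the updates
    target_modules=["q_proj", "v_proj"],  # which LLaMA modules get adapters
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # a small fraction of LLaMA's weights
```

After wrapping, the model can be trained with a standard causal-LM loop (e.g., `transformers.Trainer`) on the parsed self-chat turns; only the adapter weights need to be saved and distributed.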