Blending Is All You Need: A Cheaper, Better Alternative to Trillion-Parameter LLMs
Abstract: In conversational AI research, there is a noticeable trend towards developing models with a larger number of parameters, exemplified by models like ChatGPT. While these expansive models tend to generate increasingly better chat responses, they demand significant computational resources and memory. This study explores a pertinent question: can a combination of smaller models collaboratively achieve comparable or enhanced performance relative to a single large model? We introduce an approach termed "blending", a straightforward yet effective method of integrating multiple chat AIs. Our empirical evidence suggests that when specific smaller models are synergistically blended, they can potentially outperform or match the capabilities of much larger counterparts. For instance, integrating just three models of moderate size (6B/13B parameters) can rival or even surpass the performance metrics of a substantially larger model like ChatGPT (175B+ parameters). This hypothesis is rigorously tested using A/B testing methodologies with a large user base on the Chai research platform over a span of thirty days. The findings underscore the potential of the "blending" strategy as a viable approach for enhancing chat AI efficacy without a corresponding surge in computational demands.
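The abstract does not spell out the blending mechanism. One simple reading, consistent with "a straightforward yet effective method of integrating multiple chat AIs", is that each chat turn is served by a component model selected at random, while all models share the growing conversation history. The sketch below illustrates that interpretation; the class name, the callable-model interface, and the toy models are assumptions for illustration, not the authors' implementation.

```python
import random

class BlendedChatAI:
    """Sketch of per-turn blending: each response is generated by a
    randomly chosen component model (an assumption about the method).
    Because all models see the same shared history, each one implicitly
    conditions on the replies its peers produced on earlier turns."""

    def __init__(self, models):
        # models: list of callables mapping a conversation history
        # (list of strings) to a reply string.
        self.models = models

    def respond(self, history):
        # Select one component model uniformly at random for this turn.
        model = random.choice(self.models)
        return model(history)

# Toy stand-ins for moderate-size chat models (hypothetical).
model_a = lambda history: "reply from model A"
model_b = lambda history: "reply from model B"

blend = BlendedChatAI([model_a, model_b])
reply = blend.respond(["Hello!"])
```

Under this reading, inference cost per turn equals that of a single small model, which is how blending could avoid the computational demands of one very large model.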