LLMR: Knowledge Distillation with a Large Language Model-Induced Reward
Published 19 Sep 2024 in cs.CL and cs.AI (arXiv:2409.12500v1)
Abstract: Large language models (LLMs) have become increasingly popular and have demonstrated remarkable performance across a range of NLP tasks. However, these models are typically computationally expensive and difficult to deploy in resource-constrained environments. In this paper, we propose LLMR, a novel knowledge distillation (KD) method based on a reward function induced from LLMs. We conducted experiments on multiple datasets for dialogue generation and summarization tasks. Empirical results demonstrate that LLMR consistently outperforms traditional KD methods across tasks and datasets.
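The abstract does not spell out how the LLM-induced reward enters the distillation objective, so the following is only a minimal illustrative sketch under one plausible reading: the LLM acts as a judge that scores each student generation, and that score weights a standard temperature-softened KD loss per example. The function name `llm_reward_weighted_kd_loss` and the uniform [0, 1] reward scale are assumptions for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def llm_reward_weighted_kd_loss(student_logits: torch.Tensor,
                                teacher_logits: torch.Tensor,
                                llm_reward: torch.Tensor,
                                temperature: float = 2.0) -> torch.Tensor:
    """Per-example KD loss scaled by an LLM-assigned quality reward.

    student_logits, teacher_logits: (batch, seq_len, vocab_size)
    llm_reward: (batch,) scores, e.g. an LLM judge rating each student
    generation on a normalized [0, 1] scale (an assumption here).
    """
    t = temperature
    # Token-level KL(teacher || student) on temperature-softened distributions,
    # the standard Hinton-style distillation term.
    kl = F.kl_div(
        F.log_softmax(student_logits / t, dim=-1),
        F.softmax(teacher_logits / t, dim=-1),
        reduction="none",
    ).sum(dim=-1)                              # (batch, seq_len)
    per_example = kl.mean(dim=-1) * (t * t)    # (batch,)
    # Weight each example's distillation loss by its LLM-induced reward.
    return (llm_reward * per_example).mean()

# Toy usage with random tensors standing in for real model outputs.
batch, seq_len, vocab = 4, 16, 100
student = torch.randn(batch, seq_len, vocab, requires_grad=True)
teacher = torch.randn(batch, seq_len, vocab)
reward = torch.rand(batch)  # placeholder for LLM judge scores
loss = llm_reward_weighted_kd_loss(student, teacher, reward)
loss.backward()
```

Under this reading, examples the LLM judges as high quality contribute more gradient signal than low-quality ones; the paper's actual reward construction and how it combines with other training terms should be taken from the full text.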