- The paper presents the Parallel Train Then Merge (PTM) method to efficiently add new skills to language models while preserving core capabilities.
- The PTM approach reduces training costs by 50–95% compared to retraining and outperforms continued fine-tuning in maintaining safety and performance.
- Experimental results show that PTM effectively balances the integration of new skills with the preservation of general model performance.
Efficient Skill Addition to LLMs: A Study on Model Merging Techniques
This paper presents a detailed study of incorporating new skills into existing instruction-tuned LLMs through a method termed "parallel train then merge" (PTM). The authors address the challenge of augmenting general-purpose LLMs with new skills without the expense of retraining on combined datasets, and without causing the model to forget previously learned skills.
Methodology
The researchers compare three primary methods for model augmentation:
- Continued Fine-tuning (CFT): This method involves further training the model on new, skill-specific datasets. While computationally inexpensive, it often degrades general skills through catastrophic forgetting.
- Retraining (RT) from Scratch: This entails restarting the instruction-tuning process with both old and new datasets combined. Although effective in preserving both old and new skill sets, this approach is computationally demanding and is infeasible when the original training datasets are unavailable.
- Parallel Train Then Merge (PTM): This approach involves training a new model specifically on new skills, creating task vectors, and then merging these vectors with the original model. The main advantage of PTM is that it does not require access to the original datasets and is computationally efficient.
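The task-vector arithmetic behind PTM can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: the function names (`task_vector`, `merge`) and the merging coefficient `alpha` are hypothetical, and weights are modeled as plain dicts of floats where a real system would operate on full parameter tensors.

```python
def task_vector(finetuned, base):
    """Task vector: element-wise difference between the skill-trained
    model's weights and the original base model's weights."""
    return {k: finetuned[k] - base[k] for k in base}

def merge(base, vectors, alpha=0.5):
    """Add scaled task vectors back onto the base weights.
    `alpha` is an illustrative coefficient controlling how strongly
    each new skill is mixed in."""
    merged = dict(base)
    for vec in vectors:
        for k, v in vec.items():
            merged[k] += alpha * v
    return merged

# Toy example: a base model and a copy trained in parallel on a new skill.
base = {"w1": 0.25, "w2": -0.125}
skill_model = {"w1": 0.75, "w2": 0.125}

tv = task_vector(skill_model, base)      # difference captures the new skill
merged = merge(base, [tv], alpha=0.5)    # base model plus scaled skill vector
```

Because only the much smaller skill-specific run is trained, and the merge itself is a cheap weight-space operation, this is how PTM avoids both retraining on the original data and overwriting the base model's existing parameters outright.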
Experimental Results
The researchers conducted experiments in three skill domains: scientific literature understanding, coding, and safety. The safety domain also tests the often conflicting requirements of refusing unsafe requests while maintaining general capabilities. These skills were added to a base instruction-tuned model, Tülu, using the PTM method.
Key Findings:
- Skill Addition Efficiency: PTM achieved skill-specific performance comparable to the best RT models with a 50–95% reduction in training cost. It also preserved nearly all of the original model's general skills, clearly outperforming CFT on general-skill retention.
- Safety Compliance: PTM substantially improved refusal rates on unsafe requests and reduced exaggerated (over-cautious) refusals by 30–80%, clearly outperforming both CFT and RT in this respect.
- Generalization Performance: PTM effectively preserved the general capabilities of the model while enabling new skills, presenting a slight trade-off in general performance compared to full retraining.
Practical and Theoretical Implications
Practically, PTM offers a valuable solution for efficiently adapting models to incorporate new datasets and emerging skills, especially in settings where the original training data is inaccessible. Theoretically, this approach underscores the potential of model-merging frameworks to avoid the catastrophic forgetting and heavy computational costs associated with traditional fine-tuning and retraining strategies.
Future Directions
The study opens avenues for exploring how well PTM generalizes across model architectures and datasets. Investigating techniques that optimize the merging process for multiple skills while minimizing interference between them remains a prospect for future research. Additionally, examining PTM's utility within reinforcement learning frameworks, or extending it to post-training stages such as RLHF, could expand its applicability.
The authors demonstrate through rigorous experiments and analyses that PTM offers a compelling alternative to existing methods, balancing cost-effectiveness with the retention and expansion of model capabilities. This study contributes significantly to the field of efficient model adaptation, highlighting the trade-offs between computational expense and performance continuity in evolving AI systems.