Closed-form merging of parameter-efficient modules for Federated Continual Learning

Published 23 Oct 2024 in cs.LG and cs.AI | (2410.17961v2)

Abstract: Model merging has emerged as a crucial technique in Deep Learning, enabling the integration of multiple models into a unified system while preserving performance and scalability. In this respect, the compositional properties of low-rank adaptation techniques (e.g., LoRA) have proven beneficial, as simply averaging LoRA modules yields a single model that mostly integrates the capabilities of all individual modules. Building on LoRA, we take a step further by imposing that the merged model matches the responses of all learned modules. Solving this objective in closed form yields an indeterminate system with A and B as unknown variables, indicating the existence of infinitely many closed-form solutions. To address this challenge, we introduce LoRM, an alternating optimization strategy that trains one LoRA matrix at a time. This allows solving for each unknown variable individually, thus finding a unique solution. We apply our proposed methodology to Federated Class-Incremental Learning (FCIL), ensuring alignment of model responses both between clients and across tasks. Our method demonstrates state-of-the-art performance across a range of FCIL scenarios. The code to reproduce our experiments is available at github.com/aimagelab/fed-mammoth.

Summary

  • The paper presents LoRM, a closed-form merging approach for LoRA modules that achieves a deterministic solution by alternating optimization between matrix variables.
  • It applies LoRM in Federated Class-Incremental Learning, enhancing model generalization and mitigating forgetting across sequential tasks.
  • Empirical evaluations on CIFAR-100, ImageNet-R, and EuroSAT demonstrate state-of-the-art performance and reduced communication overhead in federated settings.

Overview of Closed-Form Merging of Parameter-Efficient Modules for Federated Continual Learning

This paper addresses the integration of parameter-efficient modules within the framework of Federated Continual Learning (FCL), emphasizing a novel closed-form solution for merging Low-Rank Adaptation (LoRA) modules. The proposed technique, LoRM (Low-rank Regression Mean), adapts RegMean, a model merging technique rooted in regression, for the aggregation of LoRA parameters across federated clients.

Key Contributions

The research makes several notable contributions:

  1. Closed-Form Merging of LoRA Modules: The paper introduces a method for merging LoRA modules in closed form, overcoming the indeterminacy that arises when both matrices (A and B) are treated as unknowns. By freezing one matrix and solving for the other, the authors sidestep the infinitely many possible solutions, thus obtaining a deterministic outcome.
  2. Application to Federated Class-Incremental Learning (FCIL): LoRM is applied within the FCIL context, addressing both spatial and temporal aggregation of learning tasks. By using the closed-form solution over communication rounds of federated clients, the approach enhances model generalization and mitigates the forgetting of prior knowledge over sequential tasks.
  3. Empirical Evaluation and State-of-the-Art Performance: The methodology is empirically validated on benchmarks such as CIFAR-100, ImageNet-R, and EuroSAT, achieving state-of-the-art results compared to existing FCIL techniques and traditional models like EWC and LwF.
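The indeterminacy noted in point 1 is easy to see directly: for any invertible r×r matrix S, the pair (B·S, S⁻¹·A) produces exactly the same low-rank update as (B, A), so matching responses alone cannot pin down a unique factorization. A minimal numpy illustration (dimensions chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 64, 128, 4  # arbitrary LoRA dimensions

B = rng.standard_normal((d_out, r))
A = rng.standard_normal((r, d_in))

# Any invertible r x r matrix S yields an equally valid factorization.
S = rng.standard_normal((r, r)) + 4 * np.eye(r)  # well-conditioned, invertible
B_alt = B @ S
A_alt = np.linalg.inv(S) @ A

# Both pairs realize the exact same low-rank update B @ A.
print(np.allclose(B @ A, B_alt @ A_alt))  # True
```

Freezing one factor removes this degree of freedom, which is precisely what makes the alternating scheme well-posed.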

Methodology

The core of the method is built on RegMean, adapted for LoRA. Instead of the straightforward averaging or coefficient-based merging used in prior works, LoRM alternates optimization between LoRA’s A and B matrices. The closed-form solution computes unique, optimal parameters by fixing either A or B at a time, facilitating efficient parameter sharing among clients.
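The closed-form step can be sketched as follows. Suppose each client i holds LoRA factors (B_i, A_i) and a RegMean-style input Gram matrix G_i = X_i X_iᵀ. With a shared A held fixed, minimizing Σ_i ‖B A X_i − B_i A_i X_i‖²_F over B reduces to an r×r linear system with a unique (ridge-regularized) solution. The function name and exact objective below are illustrative assumptions, not the paper's API:

```python
import numpy as np

def merge_B_closed_form(A, client_Bs, client_As, client_Gs, eps=1e-6):
    """Solve for a merged B with the shared A held fixed (illustrative sketch).

    Minimizes sum_i ||B @ A @ X_i - B_i @ A_i @ X_i||_F^2, where
    G_i = X_i @ X_i.T is client i's input Gram matrix (RegMean-style statistic).
    Setting the gradient to zero gives:
        B @ (sum_i A G_i A.T) = sum_i B_i A_i G_i A.T
    """
    r = A.shape[0]
    lhs = sum(A @ G @ A.T for G in client_Gs)                        # (r, r)
    rhs = sum(Bi @ Ai @ G @ A.T
              for Bi, Ai, G in zip(client_Bs, client_As, client_Gs)) # (d_out, r)
    # A small ridge term keeps the r x r system well-posed.
    return rhs @ np.linalg.inv(lhs + eps * np.eye(r))
```

As a sanity check, if every client holds the same factors, the merged B recovers them (up to the negligible ridge term); the symmetric update for A with B fixed follows the same pattern.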

For the FCIL setting, LoRM orchestrates communication rounds where clients locally optimize parameters, followed by server-side aggregation. This process effectively aligns model responses across different federated tasks incrementally introduced to clients.
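Schematically, the round structure under this alternating scheme might look like the following, where `local_update` and `merge` are placeholders (the merge standing in for the closed-form solver); all names here are illustrative assumptions, not the paper's code:

```python
def run_fcil_task(server_A, server_B, clients, num_rounds, merge):
    """Illustrative alternating schedule for one incremental task.

    Even rounds: clients train B with A frozen; the server merges the B's.
    Odd rounds: the roles swap. `merge` abstracts the closed-form solver.
    """
    for rnd in range(num_rounds):
        train_B = (rnd % 2 == 0)
        updates = []
        for client in clients:
            # Each client starts from the current global factors and
            # optimizes only the unfrozen matrix on its local data.
            updates.append(client.local_update(
                server_A, server_B,
                train_matrix="B" if train_B else "A"))
        if train_B:
            server_B = merge([u["B"] for u in updates], fixed=server_A)
        else:
            server_A = merge([u["A"] for u in updates], fixed=server_B)
    return server_A, server_B
```

Only the currently trained matrix changes in a given round, so only that matrix needs to travel between clients and server.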

Results and Implications

Numerical results demonstrate superior accuracy and efficiency on in-domain and out-of-domain datasets, particularly under scenarios of high data heterogeneity. This underscores LoRM's potential to enhance distributed learning frameworks by refining module integration techniques.

The alternating optimization strategy contributes to accelerated convergence rates, a key advantage in federated settings where communication overhead is a critical constraint. Furthermore, by reducing the parameters that need communication, LoRM offers a privacy-preserving design suitable for distributed environments.
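To make the communication saving concrete, consider illustrative ViT-Base-like dimensions (hidden size 768) and a LoRA rank of 16; sending only the currently trained matrix per round roughly halves the LoRA payload. The figures below are back-of-the-envelope assumptions for illustration, not measurements from the paper:

```python
d_in = d_out = 768   # hidden size of a ViT-Base-style layer (assumed)
r = 16               # LoRA rank (assumed)

full_weight   = d_in * d_out        # dense update: 589,824 params per layer
lora_pair     = r * (d_in + d_out)  # both LoRA factors: 24,576 params
single_factor = r * d_in            # only the trained matrix: 12,288 params

print(full_weight, lora_pair, single_factor)  # 589824 24576 12288
```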

Future Directions

This work paves the way for further exploration of parameter-efficient fine-tuning (PEFT) methods. The closed-form nature of LoRM suggests extensibility to other module types beyond LoRA, such as VeRA. Future research may involve testing the framework across varied federated scenarios and tasks, aiming for broader applicability and robustness.

In conclusion, by advancing the methodology of parameter-efficient module composition, this paper significantly contributes to Federated Continual Learning, driving efficient and scalable models capable of dynamic, decentralized knowledge integration.
