
Communication-Efficient Personalized Adaptation via Federated-Local Model Merging

arXiv:2602.18658v1 Abstract: Parameter-efficient fine-tuning methods, such as LoRA, offer a practical way to adapt large vision and language models to client tasks. However, this becomes particularly challenging under task-level heterogeneity in federated deployments. In this regime, personalization requires balancing general knowledge with personalized knowledge, yet existing approaches largely rely on heuristic mixing rules and lack theoretical justification. Moreover, prior model merging approaches are also computation and communication intensive, making the process inefficient in federated settings. In this work, we propose Potara, a principled framework for federated personalization that constructs a personalized model for each client by merging two complementary models: (i) a federated model capturing general knowledge, and (ii) a local model capturing personalized knowledge. Through the construct of linear mode connectivity, we show that the expected task loss admits a variance trace upper bound, whose minimization yields closed-form optimal mixing weights that guarantee a tighter bound for the merged model than for either the federated or local model alone. Experiments on vision and language benchmarks show that Potara consistently improves personalization while reducing communication, leading to a strong performance-communication trade-off.
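The core operation the abstract describes, merging a federated model with a local model via a mixing weight, can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: Potara operates on LoRA adapters and derives the weight from a variance-trace bound, whereas here the weight `lam` is supplied directly and the parameter names are hypothetical.

```python
import numpy as np


def merge_adapters(theta_fed, theta_local, lam):
    """Linearly interpolate per-parameter between federated and local models.

    theta_fed / theta_local: dicts mapping parameter names to arrays.
    lam: mixing weight in [0, 1]; 1.0 recovers the federated model,
    0.0 the local model.
    """
    return {k: lam * theta_fed[k] + (1.0 - lam) * theta_local[k]
            for k in theta_fed}


# Toy LoRA-style low-rank factors for a single layer (shapes illustrative).
fed = {"lora_A": np.ones((4, 2)), "lora_B": np.zeros((2, 4))}
loc = {"lora_A": np.zeros((4, 2)), "lora_B": np.ones((2, 4))}

merged = merge_adapters(fed, loc, lam=0.6)
print(merged["lora_A"][0, 0])  # 0.6
print(merged["lora_B"][0, 0])  # 0.4
```

Because the merge is a pure parameter-space interpolation, a client only needs the federated adapter and its own local adapter; no extra rounds of communication are required at merge time, which is consistent with the communication savings the abstract claims.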

Executive Summary

This article proposes Potara, a framework for federated personalization that adapts large models to client tasks under task-level heterogeneity. By merging a federated model capturing general knowledge with a local model capturing personalized knowledge, Potara personalizes each client's model while keeping communication costs low. Using linear mode connectivity, the authors derive closed-form optimal mixing weights that guarantee the merged model a tighter loss bound than either component model alone. Experiments on vision and language benchmarks show that Potara improves personalization accuracy while reducing communication, yielding a stronger performance-communication trade-off than existing approaches.

Key Points

  • Potara is a principled framework for federated personalization under task-level heterogeneity
  • Each client's personalized model merges two complementary models: a federated model (general knowledge) and a local model (personalized knowledge)
  • Via linear mode connectivity, the expected task loss admits a variance-trace upper bound whose minimization yields closed-form optimal mixing weights
  • The merged model's bound is guaranteed to be tighter than that of either the federated or local model alone
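The closed-form weights themselves are not reproduced in this summary. As a hedged, scalar illustration of the same variance-minimization idea (the symbols $\lambda$, $\sigma_f^2$, $\sigma_l^2$ are ours, not the paper's), consider merging under an independent-noise assumption:

$$\hat\theta(\lambda) = \lambda\,\theta_{\mathrm{fed}} + (1-\lambda)\,\theta_{\mathrm{loc}}, \qquad \operatorname{Var}[\hat\theta(\lambda)] = \lambda^2 \sigma_f^2 + (1-\lambda)^2 \sigma_l^2.$$

Setting the derivative with respect to $\lambda$ to zero gives

$$\lambda^{\star} = \frac{\sigma_l^2}{\sigma_f^2 + \sigma_l^2}, \qquad \operatorname{Var}[\hat\theta(\lambda^{\star})] = \frac{\sigma_f^2\,\sigma_l^2}{\sigma_f^2 + \sigma_l^2} \le \min(\sigma_f^2,\, \sigma_l^2),$$

which mirrors, in miniature, the paper's guarantee that the merged model's bound is tighter than that of either the federated or local model alone.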

Merits

Strength

Potara offers a principled approach to federated personalization, backing model merging with a theoretical justification while reducing communication cost.

Demerits

Limitation

The framework's applicability is limited to scenarios with a moderate number of clients, and its performance may degrade in large-scale deployments.

Expert Commentary

Potara demonstrates significant potential in tackling the challenges of federated personalization, but its behavior in large-scale deployments requires further investigation. In addition, the framework's reliance on linear mode connectivity may restrict it to model families and tasks where that property holds. Nevertheless, the authors' effort to put model merging on a principled footing while reducing communication cost is commendable, and, since personalization is achieved without sharing raw client data, the framework also has favorable implications for data protection and privacy.

Recommendations

  • Future research should focus on applying Potara's framework to large-scale deployments and exploring its applicability to various types of models and tasks.
  • The authors should further investigate the limitations of linear mode connectivity and explore alternative approaches to derive optimal mixing weights.
