Unveiling Language Routing Isolation in Multilingual MoE Models for Interpretable Subnetwork Adaptation
arXiv:2604.03592v1 Announce Type: new Abstract: Mixture-of-Experts (MoE) models exhibit striking performance disparities across languages, yet the internal mechanisms driving these gaps remain poorly understood. In …