Functional Component Ablation Reveals Specialization Patterns in Hybrid Language Model Architectures
arXiv:2603.22473v1 Announce Type: new Abstract: Hybrid language models combining attention with state space models (SSMs) or linear attention offer improved efficiency, but whether both components …
Hector Borobia, Elies Seguí-Mas, Guillermina Tormo-Carbó