Marmik Chaudhari, Nishkal Hundia, Idhant Gulati

Articles by Marmik Chaudhari, Nishkal Hundia, Idhant Gulati

Academic · 1 min

Sparse Crosscoders for diffing MoEs and Dense models

arXiv:2603.05805v1. Abstract: Mixture of Experts (MoE) models achieve parameter-efficient scaling through sparse expert routing, yet their internal representations remain poorly understood compared to …