Disentangling Geometry, Performance, and Training in Language Models
arXiv:2602.20433v1 Announce Type: new Abstract: Geometric properties of Transformer weights, particularly the unembedding matrix, have been widely useful in language model interpretability research. Yet, their …
Atharva Kulkarni, Jacob Mitchell Springer, Arjun Subramonian, Swabha Swayamdipta
1 views