Nathan Godey, Yoav Artzi

Articles by Nathan Godey, Yoav Artzi

Academic · 1 min

Lost in Backpropagation: The LM Head is a Gradient Bottleneck

arXiv:2603.10145v1. Abstract: The last layer of neural language models (LMs) projects output features of dimension $D$ to logits in dimension $V$, the …

35 views Mar 12
