Length Generalization Bounds for Transformers
arXiv:2603.02238v1
Abstract: Length generalization is a key property of a learning algorithm that enables it to make correct predictions on inputs of …
Andy Yang, Pascal Bergsträßer, Georg Zetzsche, David Chiang, Anthony W. Lin