Not all tokens are needed (NAT): token efficient reinforcement learning
arXiv:2603.06619v1. Abstract: Reinforcement learning (RL) has become a key driver of progress in large language models, but scaling RL to long chain-of-thought …
Hejian Sang, Yuanda Xu, Zhengze Zhou, Ran He, Zhipeng Wang