Learning to Reason with Curriculum I: Provable Benefits of Autocurriculum
arXiv:2603.18325v1 Announce Type: new Abstract: Chain-of-thought reasoning, where language models expend additional computation by producing thinking tokens prior to final responses, has driven significant advances …
Nived Rajaraman, Audrey Huang, Miro Dudik, Robert Schapire, Dylan J. Foster, Akshay Krishnamurthy
6 views