From AI Assistant to AI Scientist: Autonomous Discovery of LLM-RL Algorithms with LLM Agents
arXiv:2603.23951v1 Announce Type: new Abstract: Discovering improved policy optimization algorithms for language models remains a costly manual process requiring repeated mechanism-level modification and validation. Unlike …
Sirui Xia, Yikai Zhang, Aili Chen, Siye Wu, Siyu Yuan, Yanghua Xiao
6 views