RAPO: Expanding Exploration for LLM Agents via Retrieval-Augmented Policy Optimization
arXiv:2603.03078v1 Announce Type: new Abstract: Agentic Reinforcement Learning (Agentic RL) has shown remarkable potential in large language model-based (LLM) agents. These works can empower LLM …
Siwei Zhang, Yun Xiong, Xi Chen, Zi'an Jia, Renhong Huang, Jiarong Xu, Jiawei Zhang
11 views