How to Train Your Deep Research Agent? Prompt, Reward, and Policy Optimization in Search-R1
arXiv:2602.19526v1 Announce Type: new

Abstract: Deep Research agents tackle knowledge-intensive tasks through multi-round retrieval and decision-oriented generation. While reinforcement learning (RL) has been shown to …