Learning When to Act or Refuse: Guarding Agentic Reasoning Models for Safe Multi-Step Tool Use
arXiv:2603.03205v1 Announce Type: new Abstract: Agentic language models operate in a fundamentally different safety regime than chat models: they must plan, call tools, and execute …
Aradhye Agarwal, Gurdit Siyan, Yash Pandya, Joykirat Singh, Akshay Nambi, Ahmed Awadallah
4 views