Guopeng Li, Matthijs T. J. Spaan, Julian F. P. Kooij

Articles by Guopeng Li, Matthijs T. J. Spaan, Julian F. P. Kooij

Academic · 1 min

Off-Policy Safe Reinforcement Learning with Constrained Optimistic Exploration

arXiv:2603.23889v1. Abstract: When safety is formulated as a limit of cumulative cost, safe reinforcement learning (RL) aims to learn policies that maximize …

9 views Mar 26
