Beyond Hard Constraints: Budget-Conditioned Reachability For Safe Offline Reinforcement Learning
arXiv:2603.22292v1 Announce Type: new Abstract: Sequential decision making using Markov Decision Process underpins many realworld applications. Both model-based and model free methods have achieved strong …
Janaka Chathuranga Brahmanage, Akshat Kumar
9 views