Value Bonuses using Ensemble Errors for Exploration in Reinforcement Learning
arXiv:2602.12375v1
Abstract: Optimistic value estimates provide one mechanism for directed exploration in reinforcement learning (RL). The agent acts greedily with respect to …
Abdul Wahab, Raksha Kumaraswamy, Martha White