Minimax Optimal Strategy for Delayed Observations in Online Reinforcement Learning
arXiv:2603.03480v1
Abstract: We study reinforcement learning with delayed state observation, where the agent observes the current state after some random number of …
Harin Lee, Kevin Jamieson