$\kappa$-Explorer: A Unified Framework for Active Model Estimation in MDPs
arXiv:2602.20404v1 (announce type: new)
Abstract: In tabular Markov decision processes (MDPs) with perfect state observability, each trajectory provides active samples from the transition distributions conditioned …
Xihe Gu, Urbashi Mitra, Tara Javidi