On the Structural Non-Preservation of Epistemic Behaviour under Policy Transformation
arXiv:2602.21424v1 (Announce Type: new)
Abstract: Reinforcement learning (RL) agents under partial observability often condition actions on internally accumulated information such as memory or inferred latent …
Alexander Galozy