When Sensors Fail: Temporal Sequence Models for Robust PPO under Sensor Drift
arXiv:2603.04648v1 Announce Type: new Abstract: Real-world reinforcement learning systems must operate under distributional drift in their observation streams, yet most policy architectures implicitly assume fully …
Kevin Vogt-Lowell, Theodoros Tsiligkaridis, Rodney Lafuente-Mercado, Surabhi Ghatti, Shanghua Gao, Marinka Zitnik, Daniela Rus
3 views