MemoryArena: Benchmarking Agent Memory in Interdependent Multi-Session Agentic Tasks
arXiv:2602.16313v1 Announce Type: new Abstract: Existing evaluations of agents with memory typically assess memorization and action in isolation. One class of benchmarks evaluates memorization by …
Zexue He, Yu Wang, Churan Zhi, Yuanzhe Hu, Tzu-Ping Chen, Lang Yin, Ze Chen, Tong Arthur Wu, Siru Ouyang, Zihan Wang, Jiaxin Pei, Julian McAuley, Yejin Choi, Alex Pentland
6 views