TimeWarp: Evaluating Web Agents by Revisiting the Past
arXiv:2603.04949v1
Abstract: The improvement of web agents on current benchmarks raises the question: Do today's agents perform just as well when the …
Md Farhan Ishmam, Kenneth Marino