See and Remember: A Multimodal Agent for Web Traversal
arXiv:2603.02626v1
Abstract: Autonomous web navigation requires agents to perceive complex visual environments and maintain long-term context, yet current Large Language Model (LLM) …
Xinjun Wang, Shengyao Wang, Aimin Zhou, Hao Hao