EpiBench: Benchmarking Multi-turn Research Workflows for Multimodal Agents
arXiv:2604.05557v1
Abstract: Scientific research follows multi-turn, multi-step workflows that require proactively searching the literature, consulting figures and tables, and integrating evidence across …