AD-Bench: A Real-World, Trajectory-Aware Advertising Analytics Benchmark for LLM Agents
arXiv:2602.14257v1 (Announce Type: new)
Abstract: While Large Language Model (LLM) agents have achieved remarkable progress in complex reasoning tasks, evaluating their performance in real-world environments …
Lingxiang Hu, Yiding Sun, Tianle Xia, Wenwei Li, Ming Xu, Liqun Liu, Peng Shu, Huan Yu, Jie Jiang