One-Eval: An Agentic System for Automated and Traceable LLM Evaluation
arXiv:2603.09821v1
Abstract: Reliable evaluation is essential for developing and deploying large language models, yet in practice it often requires substantial manual effort: …
Chengyu Shen, Yanheng Hou, Minghui Pan, Runming He, Zhen Hao Wong, Meiyi Qiang, Zhou Liu, Hao Liang, Peichao Lai, Zeang Sheng, Wentao Zhang