RewardHackingAgents: Benchmarking Evaluation Integrity for LLM ML-Engineering Agents
arXiv:2603.11337v1 Announce Type: new Abstract: LLM agents increasingly perform end-to-end ML engineering tasks where success is judged by a single scalar test metric. This creates …
Yonas Atinafu, Robin Cohen