Mateusz Nowak, Xavier Cadet, Peter Chin

Articles by Mateusz Nowak, Xavier Cadet, Peter Chin

Academic · 1 min

ABCD: All Biases Come Disguised

arXiv:2602.17445v1. Abstract: Multiple-choice question (MCQ) benchmarks have been a standard evaluation practice for measuring LLMs' ability to reason and answer knowledge-based questions. …

Mateusz Nowak, Xavier Cadet, Peter Chin