Automated Concept Discovery for LLM-as-a-Judge Preference Analysis
arXiv:2603.03319v1 Announce Type: cross Abstract: Large Language Models (LLMs) are increasingly used as scalable evaluators of model outputs, but their preference judgments exhibit systematic biases …
James Wedgwood, Chhavi Yadav, Virginia Smith
14 views