Unpacking Human Preference for LLMs: Demographically Aware Evaluation with the HUMAINE Framework
arXiv:2603.04409v1
Abstract: The evaluation of large language models faces significant challenges. Technical benchmarks often lack real-world relevance, while existing human preference evaluations …
Nora Petrova, Andrew Gordon, Enzo Blindow