Beyond Binary Preferences: A Principled Framework for Reward Modeling with Ordinal Feedback
arXiv:2603.02232v1 Announce Type: new Abstract: Reward modeling is crucial for aligning large language models with human preferences, yet current approaches lack a principled mathematical framework …
Amirhossein Afsharrad, Ruida Zhou, Luca Viano, Sanjay Lall, Mohammad Ghavamzadeh
14 views