Zimeng Li, Mudit Gaur, Vaneet Aggarwal

Articles by Zimeng Li, Mudit Gaur, Vaneet Aggarwal

Academic · 1 min

Oracle-Robust Online Alignment for Large Language Models

arXiv:2602.20457v1. Abstract: We study online alignment of large language models under misspecified preference feedback, where the observed preference oracle deviates from an …

Zimeng Li, Mudit Gaur, Vaneet Aggarwal