MAVRL: Learning Reward Functions from Multiple Feedback Types with Amortized Variational Inference
arXiv:2602.15206v1. Abstract: Reward learning typically relies on a single feedback type or combines multiple feedback types using manually weighted loss terms. Currently, …
Raphaël Baur, Yannick Metz, Maria Gkoulta, Mennatallah El-Assady, Giorgia Ramponi, Thomas Kleine Buening