Decoupling Strategy and Execution in Task-Focused Dialogue via Goal-Oriented Preference Optimization
arXiv:2602.15854v1 (announce type: cross)
Abstract: Large language models show potential in task-oriented dialogue systems, yet existing training methods often rely on token-level likelihood or preference …
Jingyi Xu, Xingyu Ren, Zhiqiang You, Yumeng Zhang, Zhoupeng Shou