Academic

Academic · 1 min

Mind the Sim2Real Gap in User Simulation for Agentic Tasks

arXiv:2603.11245v1. Abstract: As NLP evaluation shifts from static benchmarks to multi-turn interactive settings, LLM-based simulators have become widely used as user proxies, …

Xuhui Zhou, Weiwei Sun, Qianou Ma, Yiqing Xie, Jiarui Liu, Weihua Du, Sean Welleck, Yiming Yang, Graham Neubig, Sherry Tongshuang Wu, Maarten Sap
21 views
Academic · 1 min

LLM-Augmented Digital Twin for Policy Evaluation in Short-Video Platforms

arXiv:2603.11333v1. Abstract: Short-video platforms are closed-loop, human-in-the-loop ecosystems where platform policy, creator incentives, and user behavior co-evolve. This feedback structure makes counterfactual …

Haoting Zhang, Yunduan Lin, Jinghai He, Denglin Jiang, Zuo-Jun (Max) Shen, Zeyu Zheng
21 views