The Art of Efficient Reasoning: Data, Reward, and Optimization
arXiv:2602.20945v1 Announce Type: new Abstract: Large Language Models (LLMs) consistently benefit from scaled Chain-of-Thought (CoT) reasoning, but also suffer from heavy computational overhead. To address …
Taiqiang Wu, Zenan Zu, Bo Zhou, Ngai Wong
6 views