Qinhang Wu, Sen Lin, Ming Zhang, Yingbin Liang, Ness B. Shroff

Articles by Qinhang Wu, Sen Lin, Ming Zhang, Yingbin Liang, Ness B. Shroff

Academic · 1 min

Constraint-Rectified Training for Efficient Chain-of-Thought

arXiv:2602.12526v1. Abstract: Chain-of-Thought (CoT) has significantly enhanced the reasoning capabilities of Large Language Models (LLMs), especially when combined with reinforcement learning (RL) …
