SHAPE: Stage-aware Hierarchical Advantage via Potential Estimation for LLM Reasoning
arXiv:2604.06636v1
Abstract: Process supervision has emerged as a promising approach for enhancing LLM reasoning, yet existing methods fail to distinguish meaningful progress …
Zhengyang Ai, Zikang Shan, Xiaodong Ai, Jingxian Tang, Hangkai Hu, Pinyan Lu