

Dohyung Kim, Minbeom Kim, Jeonghye Kim, Sangmook Lee, Sojeong Rhee, Kyomin Jung



Beyond Normalization: Rethinking the Partition Function as a Difficulty Scheduler for RLVR

arXiv:2602.12642v1

Abstract: Reward-maximizing RL methods enhance the reasoning performance of LLMs, but often reduce the diversity among outputs. Recent works address this …

Mar 7


