RMNP: Row-Momentum Normalized Preconditioning for Scalable Matrix-Based Optimization
arXiv:2603.20527v1 Announce Type: new Abstract: Preconditioned adaptive methods have gained significant attention for training deep neural networks, as they capture rich curvature information of the …
Shenyang Deng, Zhuoli Ouyang, Tianyu Pang, Zihang Liu, Ruochen Jin, Shuhua Yu, Yaoqing Yang
6 views