NuMuon: Nuclear-Norm-Constrained Muon for Compressible LLM Training
arXiv:2603.03597v1 Announce Type: new Abstract: The rapid progress of large language models (LLMs) is increasingly constrained by memory and deployment costs, motivating compression methods for …
Hadi Mohaghegh Dolatabadi, Thalaiyasingam Ajanthan, Sameera Ramasinghe, Chamin P Hewa Koneputugodage, Shamane Siriwardhana, Violetta Shevchenko, Karol Pajak, James Snewin, Gil Avraham, Alexander Long
9 views