Distribution-Aware Companding Quantization of Large Language Models
arXiv:2603.00364v1

Abstract: Large language models such as GPT and Llama are trained with a next-token prediction loss. In this work, we suggest …
Athul Radhakrishnan, Siddhant Mohan, Mahima Sachdeva