1-Bit Wonder: Improving QAT Performance in the Low-Bit Regime through K-Means Quantization
arXiv:2602.15563v1. Abstract: Quantization-aware training (QAT) is an effective method to drastically reduce the memory footprint of LLMs while keeping performance degradation at …
Sohir Maskey, Constantin Eichenberg, Johannes Messner, Douglas Orr