RUQuant: Towards Refining Uniform Quantization for Large Language Models
arXiv:2604.04013v1 (announce type: new)
Abstract: The increasing size and complexity of large language models (LLMs) have raised significant challenges in deployment efficiency, particularly under resource …
Han Liu, Haotian Gao, Changya Li, Feng Zhang, Xiaotong Zhang, Wei Wang, Hong Yu