SPQ: An Ensemble Technique for Large Language Model Compression
arXiv:2602.18420v1 Announce Type: new Abstract: This study presents an ensemble technique, SPQ (SVD-Pruning-Quantization), for large language model (LLM) compression that combines variance-retained singular value decomposition …
Jiamin Yao, Eren Gultepe
3 views