Quality follows upgrading

Chen Guanzhong

Articles by Chen Guanzhong

Academic · 1 min

VSPrefill: Vertical-Slash Sparse Attention with Lightweight Indexing for Long-Context Prefilling

arXiv:2603.04460v1 (announce type: new). Abstract: The quadratic complexity of self-attention during the prefill phase impedes long-context inference in large language models. Existing sparse attention methods …

Chen Guanzhong

16 views Mar 7

JCG, PC

HSOLLC Co., Ltd.