Efficient Inference for Large Vision-Language Models: Bottlenecks, Techniques, and Prospects
arXiv:2604.05546v1 Announce Type: new Abstract: Large Vision-Language Models (LVLMs) enable sophisticated reasoning over images and videos, yet their inference is hindered by a systemic efficiency …
Jun Zhang, Yicheng Ji, Feiyang Ren, Yihang Li, Bowen Zeng, Zonghao Chen, Ke Chen, Lidan Shou, Gang Chen, Huan Li
8 views