See the Forest for the Trees: Loosely Speculative Decoding via Visual-Semantic Guidance for Efficient Inference …
arXiv:2604.05650v1
Abstract: Video Large Language Models (Video-LLMs) excel in video understanding but suffer from high inference latency during autoregressive generation. Speculative Decoding …
Yicheng Ji, Jun Zhang, Jinpeng Chen, Cong Wang, Lidan Shou, Gang Chen, Huan Li