MineDraft: A Framework for Batch Parallel Speculative Decoding
arXiv:2603.18016v1 Announce Type: new Abstract: Speculative decoding (SD) accelerates large language model inference by using a smaller draft model to propose draft tokens that are …
Zhenwei Tang, Arun Verma, Zijian Zhou, Zhaoxuan Wu, Alok Prakash, Daniela Rus, Bryan Kian Hsiang Low
27 views