Multi-Drafter Speculative Decoding with Alignment Feedback
arXiv:2604.05417v1. Abstract: Speculative decoding (SD) accelerates large language model (LLM) inference by using a smaller model to draft future tokens, which are …
Taehyeon Kim, Hojung Jung, Se-Young Yun