Training-free Dropout Sampling for Semantic Token Acceptance in Speculative Decoding
arXiv:2603.03333v1 (Announce Type: new)
Abstract: Speculative decoding accelerates large language model inference by proposing tokens with a lightweight draft model and selectively accepting them using …
Jeongtae Lee, Minjung Jo, Hyunjoon Jeong, Gunho Park, Sunghyeon Woo, Joonghoon Kim, Se Jung Kwon, Dongsoo Lee