MoE-Spec: Expert Budgeting for Efficient Speculative Decoding
arXiv:2602.16052v1
Abstract: Speculative decoding accelerates Large Language Model (LLM) inference by verifying multiple drafted tokens in parallel. However, for Mixture-of-Experts (MoE) models, …
Bradley McDanel, Steven Li, Sruthikesh Surineni, Harshit Khaitan
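For readers unfamiliar with the mechanism the abstract refers to, below is a minimal sketch of the draft-and-verify loop that speculative decoding is built on. It uses a simplified greedy-match acceptance rule and toy stand-in models; every name and helper here is an illustrative assumption, not the paper's MoE-specific method or expert-budgeting scheme.

```python
import random

# Toy stand-ins for a cheap draft model and an expensive target model.
# Each maps a context (tuple of token ids) to a next-token choice.
# In practice these would be forward passes through real LLMs.
VOCAB = list(range(8))

def toy_logits(context, seed):
    rng = random.Random((hash(context) ^ seed) & 0xFFFFFFFF)
    return [rng.random() for _ in VOCAB]

def greedy(logits):
    return max(range(len(logits)), key=lambda t: logits[t])

def draft_next(context):   # cheap draft model (assumption: greedy decoding)
    return greedy(toy_logits(context, seed=1))

def target_next(context):  # expensive target model (assumption: greedy decoding)
    return greedy(toy_logits(context, seed=2))

def speculative_step(context, k=4):
    """Draft k tokens, then verify them against the target model.

    In a real system the target scores all k draft positions in a
    single parallel forward pass; here we just accept the longest
    prefix where its greedy choice agrees with the draft, plus one
    corrected (or bonus) token from the target.
    """
    # 1) Draft phase: autoregressively propose k tokens with the cheap model.
    drafts, ctx = [], list(context)
    for _ in range(k):
        t = draft_next(tuple(ctx))
        drafts.append(t)
        ctx.append(t)

    # 2) Verify phase: compare the target's prediction at each draft position.
    accepted, ctx = [], list(context)
    for t in drafts:
        expected = target_next(tuple(ctx))
        if expected != t:
            accepted.append(expected)  # target's correction ends the step
            break
        accepted.append(t)
        ctx.append(t)
    else:
        # All k drafts accepted; take one bonus token from the target.
        accepted.append(target_next(tuple(ctx)))
    return accepted

if __name__ == "__main__":
    out = [0]
    for _ in range(5):
        out += speculative_step(tuple(out), k=4)
    print(out)
```

Each call to `speculative_step` emits between 1 and k+1 tokens while, in a real deployment, costing only one parallel target-model pass; the speedup comes from how many drafts survive verification.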