Multi-Drafter Speculative Decoding with Alignment Feedback
arXiv:2604.05417v1 Announce Type: new Abstract: Speculative decoding (SD) accelerates large language model (LLM) inference by using a smaller model to draft future tokens, which are …
Quality follows upgrading
Category
arXiv:2604.05417v1 Announce Type: new Abstract: Speculative decoding (SD) accelerates large language model (LLM) inference by using a smaller model to draft future tokens, which are …
arXiv:2604.05397v1 Announce Type: new Abstract: Large Language Models (LLMs) are increasingly applied in high-stakes domains such as finance, healthcare, and education, where reliable multi-turn interactions …
arXiv:2604.05378v1 Announce Type: new Abstract: Recent progress in vision-language-action (VLA) models has enabled language-conditioned driving agents to execute natural-language navigation commands in closed-loop simulation, yet …
arXiv:2604.05350v1 Announce Type: new Abstract: Enterprise IT support interactions are fundamentally diagnostic: effective resolution requires iterative evidence gathering from ambiguous user reports to identify an …
arXiv:2604.05339v1 Announce Type: new Abstract: As LLMs become increasingly integrated into human society, evaluating their orientations on human values from social science has drawn growing …
arXiv:2604.05318v1 Announce Type: new Abstract: Harmful content detectors-particularly disinformation classifiers-are predominantly developed and evaluated on Standard American English (SAE), leaving their robustness to dialectal variation …
arXiv:2604.05302v1 Announce Type: new Abstract: Text simplification supports second language (L2) learning by providing comprehensible input, consistent with the Input Hypothesis. However, constructing personalized parallel …
arXiv:2604.05273v1 Announce Type: new Abstract: Human communication is fundamentally creative, and often makes use of subtext -- implied meaning that goes beyond the literal content …
arXiv:2604.05267v1 Announce Type: new Abstract: In the era of Large Language Models (LLMs), the Mixture of Experts (MoE) architecture has emerged as an effective approach …
arXiv:2604.05243v1 Announce Type: new Abstract: Background: Children do not simply learn that balls are round and blocks are square. They learn that shape is the …
arXiv:2604.05242v1 Announce Type: new Abstract: Multi-bit watermarking has emerged as a promising solution for embedding imperceptible binary messages into Large Language Model (LLM)-generated text, enabling …
arXiv:2604.05192v1 Announce Type: new Abstract: Byte Pair Encoding (BPE) is a widely used tokenization algorithm, whose tokens cannot extend across pre-tokenization boundaries, functionally limiting it …