Draft-Conditioned Constrained Decoding for Structured Generation in LLMs
arXiv:2603.03305v1 (cross-list) Abstract: Large language models (LLMs) are increasingly used to generate executable outputs, JSON objects, and API calls, where a single syntax error can make the output unusable. Constrained decoding enforces validity token-by-token via masking and renormalization, but it can distort generation when the model assigns low probability mass to valid continuations, pushing decoding toward locally valid yet semantically incorrect trajectories. We propose *Draft-Conditioned Constrained Decoding (DCCD)*, a simple two-step, training-free inference procedure that decouples semantic planning from structural enforcement: an unconstrained draft is generated first, and constrained decoding is then applied, conditioned on this draft, to guarantee validity. We analyze DCCD through a KL-projection view, showing that draft conditioning increases feasible mass and reduces the cumulative "projection tax" induced by hard constraints, with an optional best-of-K draft selection. Across structured reasoning benchmarks, DCCD improves strict structured accuracy by up to +24 percentage points over standard constrained decoding (e.g., 15.2% to 39.0% on GSM8K with a 1B model), and enables smaller model pairs to match or exceed much larger constrained baselines, yielding substantial gains in parameter efficiency.
Executive Summary
The article proposes Draft-Conditioned Constrained Decoding (DCCD), a two-step inference procedure that decouples semantic planning from structural enforcement in Large Language Models (LLMs). By generating an unconstrained draft first and then applying constrained decoding conditioned on this draft, DCCD increases feasible mass and reduces the cumulative 'projection tax' induced by hard constraints. The method is shown to improve strict structured accuracy by up to +24 percentage points and enables smaller model pairs to match or exceed much larger constrained baselines, yielding substantial gains in parameter efficiency. The approach provides a promising solution for generating executable outputs, JSON objects, and API calls in LLMs, where a single syntax error can render the output unusable.
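The constrained-decoding step that DCCD builds on can be sketched in a few lines: invalid tokens are masked out and the remainder is renormalized. The snippet below is a minimal illustrative sketch, not the paper's implementation; the toy distribution, token strings, and the `mask_and_renormalize` helper are all hypothetical. It also shows the failure mode the paper targets: when most of the model's mass sits on invalid tokens, renormalization concentrates it on whatever happens to be valid.

```python
def mask_and_renormalize(probs, valid_tokens):
    """Constrained-decoding step: zero out invalid tokens, renormalize.

    `probs` maps candidate tokens to probabilities; `valid_tokens` is the
    set of tokens the grammar permits at this position.
    """
    masked = {t: p for t, p in probs.items() if t in valid_tokens}
    total = sum(masked.values())
    if total == 0:
        raise ValueError("no probability mass on valid continuations")
    return {t: p / total for t, p in masked.items()}

# Toy next-token distribution: the model prefers a grammatically invalid
# token ("answer" without quotes), so only 0.3 of the mass is feasible.
probs = {'"answer"': 0.2, "answer": 0.7, "{": 0.1}
valid = {'"answer"', "{"}  # tokens the JSON grammar allows here
constrained = mask_and_renormalize(probs, valid)
```

After renormalization, `'"answer"'` gets 2/3 of the mass even though the model assigned it only 0.2, which is exactly the kind of distortion that draft conditioning aims to reduce by shifting mass toward valid continuations before the mask is applied.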
Key Points
- DCCD decouples semantic planning from structural enforcement in LLMs
- Draft conditioning increases feasible mass and reduces the cumulative 'projection tax'
- DCCD improves strict structured accuracy by up to +24 percentage points
- Smaller model pairs can match or exceed larger constrained baselines
- Parameter efficiency is substantially improved
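The "projection tax" and best-of-K ideas can be made concrete with a toy metric: sum, over decoding steps, the negative log of the feasible probability mass, and pick the draft whose conditioned pass pays the least. This is an illustrative sketch under assumed semantics; the function names, the step distributions, and the use of `-log(feasible mass)` as the per-step tax are this summary's assumptions, not the paper's code.

```python
import math

def projection_tax(step_probs, valid_sets):
    """Cumulative -log of feasible mass per step: the cost paid when
    renormalizing onto the valid token set (illustrative metric)."""
    tax = 0.0
    for probs, valid in zip(step_probs, valid_sets):
        feasible = sum(p for t, p in probs.items() if t in valid)
        tax += -math.log(max(feasible, 1e-12))
    return tax

def tax(steps):
    # Convenience wrapper over (distribution, valid-set) pairs per step.
    return projection_tax([p for p, _ in steps], [v for _, v in steps])

# Two hypothetical drafts: conditioning on draft_b leaves more mass on
# valid tokens at every step, so its cumulative tax is lower.
steps_a = [({"x": 0.9, "y": 0.1}, {"y"}), ({"x": 0.5, "y": 0.5}, {"y"})]
steps_b = [({"x": 0.2, "y": 0.8}, {"y"}), ({"x": 0.1, "y": 0.9}, {"y"})]

# Best-of-K selection: keep the draft with the smallest projection tax.
best = min([("draft_a", steps_a), ("draft_b", steps_b)],
           key=lambda item: tax(item[1]))[0]
```

Under this toy metric, `best` is `draft_b`, matching the intuition that a draft which steers the model toward valid continuations loses less probability mass to the hard constraint.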
Merits
Strength
DCCD's ability to decouple semantic planning from structural enforcement allows for more flexible and efficient generation of structured outputs.
Demerits
Limitation
The approach assumes that the unconstrained draft is a good representation of the desired output, which may not always be the case.
Expert Commentary
The article presents a well-motivated and innovative approach to constrained decoding in LLMs. The use of draft conditioning to increase feasible mass and reduce the cumulative 'projection tax' is a key insight that has the potential to significantly improve the generation of structured outputs. However, the assumption that the unconstrained draft is a good representation of the desired output may not always hold, and further research is needed to address this limitation. Additionally, the approach may be sensitive to the choice of draft selection method and the hyperparameters of the constrained decoding step. Nonetheless, DCCD is a promising direction for research in LLMs and has the potential to lead to significant advancements in natural language processing and generation.
Recommendations
- Future research should address this limitation by exploring alternative draft selection methods and hyperparameter tuning strategies for the constrained decoding step.
- DCCD should be explored in other applications where structured generation is critical, such as software development and data science.