Tethered Reasoning: Decoupling Entropy from Hallucination in Quantized LLMs via Manifold Steering
arXiv:2602.17691v1 Announce Type: cross Abstract: Quantized language models face a fundamental dilemma: low sampling temperatures yield repetitive, mode-collapsed outputs, while high temperatures (T > 2.0) cause trajectory divergence and semantic incoherence. We present HELIX, a geometric framework that decouples output entropy from hallucination by tethering hidden-state trajectories to a pre-computed truthfulness manifold. HELIX computes a Unified Truth Score (UTS) combining token-level semantic entropy with Mahalanobis distance from the manifold. When UTS indicates trajectory divergence, graduated steering vectors redirect activations toward structurally coherent regions while affecting only 0.2-2.5% of tokens. On 4-bit quantized Granite 4.0 H Small (32B/9B active, hybrid Mamba-Transformer): GSM8K maintains 88.84% accuracy at T = 3.0 (2.81pp degradation from T = 0.5); MMLU maintains 72.49% across 14,042 questions (1.24pp degradation). This demonstrates that high-temperature hallucination is primarily trajectory divergence rather than semantic collapse. Notably, steering the sparse Transformer attention layers (~10% of layers) is sufficient to correct drift in the Mamba-2 state-space formulation. Geometric tethering reveals a previously-masked High-Entropy Creative Reservoir. At T > 2.0, steered outputs exhibit 5-20% idea duplication versus 70-80% at conservative settings. Cross-architecture validation (Qwen3-30B-A3B MOE) confirms this phenomenon is architecture-independent, with 46.7% higher unique concept generation. HELIX acts as a syntax tether, enabling exploration of semantic diversity without violating the logical backbone required for valid output. This enables Multi-Temperature Synthesis, generating 200% more unique concepts than single-temperature inference.
Executive Summary
This article presents HELIX, a geometric framework that decouples output entropy from hallucination in quantized large language models (LLMs) by tethering hidden-state trajectories to a pre-computed truthfulness manifold. HELIX computes a Unified Truth Score (UTS), combining token-level semantic entropy with Mahalanobis distance from the manifold, and applies graduated steering vectors to redirect divergent activations toward structurally coherent regions. The framework preserves benchmark accuracy at sampling temperatures as high as T = 3.0 and reveals a previously-masked High-Entropy Creative Reservoir, enabling Multi-Temperature Synthesis that generates 200% more unique concepts than single-temperature inference. These results have significant implications for building more robust and creative LLMs, particularly in high-temperature settings.
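To make the mechanism concrete, the sketch below combines a token-level entropy term with a Mahalanobis distance into a UTS-style score and applies graduated steering when the score is high. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names, the blending weight `alpha`, the threshold, and the steering strengths are all hypothetical, and the paper's actual semantic-entropy estimator and manifold construction are not reproduced here.

```python
import numpy as np

def semantic_entropy(probs):
    """Shannon entropy (nats) of the next-token distribution."""
    p = probs[probs > 0]
    return float(-np.sum(p * np.log(p)))

def mahalanobis(h, mu, cov_inv):
    """Distance of hidden state h from the truthfulness-manifold centroid mu."""
    d = h - mu
    return float(np.sqrt(d @ cov_inv @ d))

def unified_truth_score(probs, h, mu, cov_inv, alpha=0.5):
    """Blend entropy and geometric divergence (alpha is a hypothetical weight)."""
    return alpha * semantic_entropy(probs) + (1 - alpha) * mahalanobis(h, mu, cov_inv)

def steer(h, mu, uts, threshold=3.0, max_strength=0.3):
    """Graduated steering: pull h toward the manifold centroid only when the
    UTS exceeds the threshold, with strength growing with the excess."""
    if uts <= threshold:
        return h  # trajectory on-manifold: leave the activation untouched
    strength = min(max_strength, 0.05 * (uts - threshold))
    return (1 - strength) * h + strength * mu
```

Because steering is conditional on the threshold, most activations pass through unchanged, which is consistent with the paper's report that only 0.2-2.5% of tokens are affected.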
Key Points
- ▸ HELIX decouples output entropy from hallucination in quantized LLMs via manifold steering
- ▸ Geometric tethering reveals a previously-masked High-Entropy Creative Reservoir
- ▸ Multi-Temperature Synthesis generates 200% more unique concepts than single-temperature inference
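The Multi-Temperature Synthesis idea in the last bullet can be sketched as pooling generations across several temperatures and deduplicating the result. Everything here is a toy stand-in: `mock_generate` is a hypothetical generator, and exact string matching substitutes for whatever semantic deduplication the paper actually uses.

```python
def unique_concepts(samples):
    """Case-insensitive exact-match dedup (a stand-in for semantic dedup)."""
    return {s.strip().lower() for s in samples}

def multi_temperature_synthesis(generate, temperatures, n_per_temp=10):
    """Pool generations across several temperatures, then deduplicate.
    `generate(t, n)` is a hypothetical callable returning n concept strings."""
    pool = []
    for t in temperatures:
        pool.extend(generate(t, n_per_temp))
    return unique_concepts(pool)

def mock_generate(t, n):
    """Toy generator: higher temperature draws from a larger concept pool."""
    vocab = max(2, int(t * 4))
    return [f"idea-{i % vocab}" for i in range(n)]
```

With this toy generator, sampling at T in {0.5, 1.5, 3.0} yields a strictly larger unique-concept set than any single conservative temperature, mirroring the paper's reported gain from pooling temperatures.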
Merits
Preserved Accuracy
HELIX maintains accuracy under aggressive sampling: a 4-bit quantized Granite 4.0 H Small model retains 88.84% GSM8K accuracy at T = 3.0, only 2.81 percentage points below the T = 0.5 baseline, and 72.49% on MMLU (1.24pp degradation).
Enhanced Creativity
The framework reveals a previously-masked High-Entropy Creative Reservoir, enabling the generation of 200% more unique concepts than single-temperature inference.
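The abstract quantifies this reservoir via idea-duplication rates (5-20% for steered outputs versus 70-80% at conservative settings). A minimal sketch of how such a rate might be measured is shown below; it uses exact-match deduplication, whereas the paper presumably applies a semantic-similarity criterion, so treat this as an assumption-laden illustration.

```python
def duplication_rate(samples):
    """Fraction of generated items that repeat an earlier item
    (case-insensitive exact match; the paper likely dedups semantically)."""
    seen, duplicates = set(), 0
    for s in samples:
        key = s.strip().lower()
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return duplicates / len(samples) if samples else 0.0
```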
Demerits
Computational Complexity
Computing the Unified Truth Score at every token (semantic entropy plus a Mahalanobis distance against the truthfulness manifold) adds inference-time overhead, and the manifold itself must be pre-computed, which may limit throughput and scalability.
Limited Generalizability
Although cross-architecture validation on Qwen3-30B-A3B (MoE) suggests the effect is architecture-independent, the reported results cover only two model families and a small set of benchmarks; broader validation and adaptation are needed to establish effectiveness across architectures and datasets.
Expert Commentary
The article presents a groundbreaking framework for decoupling output entropy from hallucination in quantized LLMs. HELIX's geometric tethering process and Unified Truth Score demonstrate a novel approach to improving model accuracy and creativity. While the framework has significant implications for the development of more robust and creative LLMs, its computational complexity and potential limited generalizability require further exploration and validation. As the field of LLMs continues to evolve, frameworks like HELIX will play a crucial role in pushing the boundaries of language understanding and generation.
Recommendations
- ✓ Further research is needed to explore the computational complexity of the geometric tethering process and its impact on model performance and scalability.
- ✓ The framework's potential limited generalizability requires further validation and adaptation to ensure its effectiveness across various architectures and datasets.