
KohakuRAG: A simple RAG framework with hierarchical document indexing

Shih-Ying Yeh, Yueh-Feng Ku, Ko-Wei Huang, Buu-Khang Tu

arXiv:2603.07612v1 Announce Type: new Abstract: Retrieval-augmented generation (RAG) systems that answer questions from document collections face compounding difficulties when high-precision citations are required: flat chunking strategies sacrifice document structure, single-query formulations miss relevant passages through vocabulary mismatch, and single-pass inference produces stochastic answers that vary in both content and citation selection. We present KohakuRAG, a hierarchical RAG framework that preserves document structure through a four-level tree representation (document $\rightarrow$ section $\rightarrow$ paragraph $\rightarrow$ sentence) with bottom-up embedding aggregation, improves retrieval coverage through an LLM-powered query planner with cross-query reranking, and stabilizes answers through ensemble inference with abstention-aware voting. We evaluate on the WattBot 2025 Challenge, a benchmark requiring systems to answer technical questions from 32 documents with $\pm$0.1% numeric tolerance and exact source attribution. KohakuRAG achieves first place on both public and private leaderboards (final score 0.861), as the only team to maintain the top position across both evaluation partitions. Ablation studies reveal that prompt ordering (+80% relative), retry mechanisms (+69%), and ensemble voting with blank filtering (+1.2pp) each contribute substantially, while hierarchical dense retrieval alone matches hybrid sparse-dense approaches (BM25 adds only +3.1pp). We release KohakuRAG as open-source software at https://github.com/KohakuBlueleaf/KohakuRAG.

Executive Summary

The article introduces KohakuRAG, a retrieval-augmented generation framework designed for question answering that demands high-precision citations. It preserves document structure through hierarchical indexing, broadens retrieval coverage with an LLM-powered query planner, and stabilizes answers through ensemble inference with abstention-aware voting. KohakuRAG achieved first place on the WattBot 2025 Challenge (final score 0.861), demonstrating its effectiveness at answering technical questions with exact source attribution.

Key Points

  • Hierarchical document indexing to preserve document structure
  • LLM-powered query planner with cross-query reranking for improved retrieval coverage
  • Ensemble inference with abstention-aware voting for stabilized answers
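The four-level index (document → section → paragraph → sentence) with bottom-up embedding aggregation can be sketched as follows. This is a minimal illustration, not the paper's implementation: the abstract does not specify the aggregation operator, so mean-pooling with renormalization is assumed here, and the `embed` function is a deterministic stand-in for a real embedding model.

```python
import hashlib

import numpy as np


def embed(text, dim=16):
    # Stand-in for a real sentence-embedding model: deterministic
    # pseudo-random unit vectors seeded by a hash of the text.
    seed = int(hashlib.sha256(text.encode()).hexdigest()[:8], 16)
    v = np.random.default_rng(seed).standard_normal(dim)
    return v / np.linalg.norm(v)


def index_node(node):
    # Bottom-up aggregation: sentences (leaves) are embedded directly;
    # paragraph, section, and document nodes average their children's
    # vectors and renormalize, so every tree level is searchable.
    if "text" in node:  # sentence leaf
        node["vec"] = embed(node["text"])
    else:  # internal node: paragraph, section, or document
        for child in node["children"]:
            index_node(child)
        mean = np.mean([c["vec"] for c in node["children"]], axis=0)
        node["vec"] = mean / np.linalg.norm(mean)
    return node


doc = index_node({
    "level": "document",
    "children": [{
        "level": "section",
        "children": [{
            "level": "paragraph",
            "children": [
                {"level": "sentence", "text": "RAG retrieves supporting passages."},
                {"level": "sentence", "text": "Citations must be exact."},
            ],
        }],
    }],
})
```

Because each level carries its own vector, retrieval can match a query against coarse section embeddings before descending to the sentences that support a precise citation.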

Merits

Effective Retrieval

KohakuRAG's hierarchical dense retrieval alone matches hybrid sparse-dense approaches (adding BM25 yields only +3.1pp), indicating that structure-aware dense retrieval already covers most relevant passages on its own.
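The cross-query step can be sketched as below: the planner's reformulated queries each retrieve candidates, the candidate pools are merged, and every passage is reranked by its best score across queries. This is one plausible fusion rule assumed for illustration; the paper's actual reranker may differ, and the random corpus here stands in for real passage embeddings.

```python
import numpy as np

rng = np.random.default_rng(7)


def unit(v):
    return v / np.linalg.norm(v)


# Stand-in corpus: 20 unit-norm passage embeddings.
corpus = np.stack([unit(rng.standard_normal(16)) for _ in range(20)])


def retrieve(query_vec, corpus_vecs, k=5):
    # Dense retrieval: cosine similarity over unit-norm vectors.
    sims = corpus_vecs @ query_vec
    top = np.argsort(-sims)[:k]
    return [(int(i), float(sims[i])) for i in top]


def cross_query_rerank(query_vecs, corpus_vecs, k=5):
    # Pool candidates from every reformulated query, then rank each
    # candidate by its best similarity across all queries, so passages
    # missed by one phrasing can still surface via another.
    best = {}
    for q in query_vecs:
        for idx, score in retrieve(q, corpus_vecs, k):
            best[idx] = max(best.get(idx, float("-inf")), score)
    return sorted(best.items(), key=lambda kv: -kv[1])[:k]


# Three reformulations of the same question (stand-in query vectors).
queries = [unit(rng.standard_normal(16)) for _ in range(3)]
ranked = cross_query_rerank(queries, corpus)
```

Merging before reranking is what addresses the vocabulary-mismatch problem the abstract describes: a passage only needs to match one query formulation well to reach the final ranking.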

Improved Answer Stability

The ensemble inference with abstention-aware voting stabilizes answers, reducing stochastic variation in both content and citation selection across runs.
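The voting step can be sketched as follows, under one plausible reading of the paper's "blank filtering": abstaining runs are dropped before a majority vote, and the system abstains only if every run abstained. The exact policy in KohakuRAG is not spelled out in the abstract, so this is an assumption for illustration.

```python
from collections import Counter


def vote(answers):
    # Abstention-aware majority vote: discard blank or missing answers
    # from the ensemble before voting, so a few abstaining runs cannot
    # outvote runs that found an answer. Abstain only if all runs did.
    kept = [a for a in answers if a is not None and a.strip()]
    if not kept:
        return None  # every run abstained -> final answer stays blank
    winner, _ = Counter(kept).most_common(1)[0]
    return winner


# Five sampled runs: three answers, one blank, one abstention.
print(vote(["42 MW", "", "42 MW", "40 MW", None]))  # -> 42 MW
```

Filtering blanks before voting matches the abstract's reported +1.2pp gain from "ensemble voting with blank filtering": without the filter, a handful of abstentions could dominate the vote even when most runs agree on a concrete answer.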

Demerits

Complexity

The hierarchical indexing and ensemble inference may increase computational cost relative to flat-chunking, single-pass baselines, since every answer requires multiple retrieval queries and multiple inference runs.

Expert Commentary

KohakuRAG's innovative approach to hierarchical document indexing and ensemble inference demonstrates a significant advancement in retrieval-augmented generation. The framework's ability to preserve document structure and stabilize answers addresses long-standing challenges in the field. However, the increased complexity of the approach may require careful consideration of computational resources and potential applications. Further research is needed to explore the scalability and adaptability of KohakuRAG to various domains and tasks.

Recommendations

  • Further evaluation of KohakuRAG on diverse datasets and tasks to assess its generalizability and robustness
  • Investigation of potential applications of KohakuRAG in real-world scenarios, such as technical support and knowledge retrieval
