DocSage: An Information Structuring Agent for Multi-Doc Multi-Entity Question Answering

Teng Lin, Yizhang Zhu, Zhengxuan Zhang, Yuyu Luo, Nan Tang

arXiv:2603.11798v1. Abstract: Multi-document Multi-entity Question Answering inherently demands models to track implicit logic between multiple entities across scattered documents. However, existing Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) frameworks suffer from critical limitations: standard RAG's vector similarity-based coarse-grained retrieval often omits critical facts, graph-based RAG fails to efficiently integrate fragmented complex relationship networks, and both lack schema awareness, leading to inadequate cross-document evidence chain construction and inaccurate entity relationship deduction. To address these challenges, we propose DocSage, an end-to-end agentic framework that integrates dynamic schema discovery, structured information extraction, and schema-aware relational reasoning with error guarantees. DocSage operates through three core modules: (1) A schema discovery module dynamically infers query-specific minimal joinable schemas to capture essential entities and relationships; (2) An extraction module transforms unstructured text into semantically coherent relational tables, enhanced by error-aware correction mechanisms to reduce extraction errors; (3) A reasoning module performs multi-hop relational reasoning over structured tables, leveraging schema awareness to efficiently align cross-document entities and aggregate evidence. This agentic design offers three key advantages: precise fact localization via SQL-powered indexing, natural support for cross-document entity joins through relational tables, and mitigated LLM attention diffusion via structured representation. Evaluations on two MDMEQA benchmarks demonstrate that DocSage significantly outperforms state-of-the-art long-context LLMs and RAG systems, achieving more than 27% accuracy improvements respectively.

Executive Summary

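The article introduces DocSage, an end-to-end agentic framework for Multi-document Multi-entity Question Answering (MDMEQA). It targets three limitations of existing LLM and RAG approaches: coarse-grained vector retrieval that omits critical facts, graph-based RAG that struggles to integrate fragmented relationship networks, and a lack of schema awareness in both. DocSage combines dynamic schema discovery, structured information extraction, and schema-aware relational reasoning, which the authors credit with precise fact localization via SQL-powered indexing, natural cross-document entity joins, and reduced attention diffusion. On two MDMEQA benchmarks, the authors report accuracy improvements of more than 27% over state-of-the-art long-context LLMs and RAG systems.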

Key Points

  • DocSage's end-to-end agentic framework integrates schema discovery, information extraction, and relational reasoning
  • The framework operates through three core modules: schema discovery, extraction, and reasoning
  • On two MDMEQA benchmarks, DocSage is reported to outperform state-of-the-art long-context LLMs and RAG systems by more than 27% in accuracy
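
The three-module flow in the key points can be illustrated with a toy pipeline. This is only a minimal sketch under stated assumptions, not the authors' implementation: the hard-coded schema, the regex patterns, and the function names `discover_schema`, `extract_rows`, `build_database`, and `answer` are all hypothetical stand-ins for steps that DocSage performs with an LLM agent.

```python
import re
import sqlite3

def discover_schema(question: str) -> dict[str, str]:
    """Hypothetical stand-in for schema discovery: a real system would infer
    a query-specific schema; here it is hard-coded and ignores the question."""
    return {
        "employment": "CREATE TABLE employment (person TEXT, company TEXT, doc_id TEXT)",
        "headquarters": "CREATE TABLE headquarters (company TEXT, city TEXT, doc_id TEXT)",
    }

# Hypothetical stand-in for extraction: regex patterns instead of an LLM.
PATTERNS = {
    "employment": re.compile(r"(\w+) works at (\w+)"),
    "headquarters": re.compile(r"(\w+) is headquartered in (\w+)"),
}

def extract_rows(doc_id: str, text: str):
    """Yield (table_name, row) pairs found in one document."""
    for table, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            yield table, (*match.groups(), doc_id)

def build_database(question: str, documents: dict[str, str]) -> sqlite3.Connection:
    """Create the tables for the question and fill them from all documents."""
    conn = sqlite3.connect(":memory:")
    for ddl in discover_schema(question).values():
        conn.execute(ddl)
    for doc_id, text in documents.items():
        for table, row in extract_rows(doc_id, text):
            # Table names come from the fixed PATTERNS keys, not user input.
            conn.execute(f"INSERT INTO {table} VALUES (?, ?, ?)", row)
    return conn

def answer(conn: sqlite3.Connection, person: str) -> list[str]:
    """Reasoning step: a two-hop lookup (person -> company -> city) as a join."""
    rows = conn.execute(
        "SELECT h.city FROM employment AS e "
        "JOIN headquarters AS h ON e.company = h.company "
        "WHERE e.person = ?",
        (person,),
    ).fetchall()
    return [city for (city,) in rows]

DOCUMENTS = {
    "doc1": "Alice works at Acme. Bob works at Globex.",
    "doc2": "Acme is headquartered in Berlin. Globex is headquartered in Oslo.",
}
conn = build_database("In which city is Alice's employer based?", DOCUMENTS)
print(answer(conn, "Alice"))
```

The point of the sketch is the division of labor: once facts from separate documents sit in joinable tables, the multi-hop question reduces to a single SQL join rather than a search over raw text.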

Merits

Improved Accuracy

DocSage's schema-aware relational reasoning, together with error-aware correction during extraction, is credited with the reported accuracy gains in question answering

Efficient Entity Joins

DocSage's relational tables enable natural support for cross-document entity joins
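
To make the idea of a cross-document entity join concrete, here is a small sketch using Python's built-in `sqlite3` module. The tables, rows, and the `normalize` helper are invented for illustration; in particular, case-folding and whitespace collapsing is an assumed, deliberately crude form of entity alignment and is not taken from the paper.

```python
import sqlite3

def normalize(name: str) -> str:
    """Crude entity alignment (an assumption): case-fold, collapse whitespace."""
    return " ".join(name.split()).casefold()

conn = sqlite3.connect(":memory:")
# Register the Python helper so it can be called from SQL.
conn.create_function("normalize", 1, normalize)
conn.executescript("""
CREATE TABLE funding (company TEXT, amount_musd REAL, doc_id TEXT);
CREATE TABLE founders (company TEXT, founder TEXT, doc_id TEXT);
""")
# Rows as they might be extracted from three different documents,
# with inconsistent spellings of the same company.
conn.executemany("INSERT INTO funding VALUES (?, ?, ?)", [
    ("Acme Corp", 12.0, "report_a"),
    ("ACME  corp", 8.0, "report_b"),
    ("Globex", 5.0, "report_b"),
])
conn.executemany("INSERT INTO founders VALUES (?, ?, ?)", [
    ("acme corp", "Alice", "profile_c"),
    ("Globex", "Bob", "profile_c"),
])

def total_funding_by_founder(conn: sqlite3.Connection):
    """Join on the normalized company name and aggregate evidence per founder."""
    query = """
        SELECT fo.founder, SUM(fu.amount_musd), COUNT(DISTINCT fu.doc_id)
        FROM founders AS fo
        JOIN funding AS fu ON normalize(fo.company) = normalize(fu.company)
        GROUP BY fo.founder
        ORDER BY fo.founder
    """
    return conn.execute(query).fetchall()

print(total_funding_by_founder(conn))
```

Each result row gives a founder, the summed funding, and the number of distinct source documents that contributed, which mirrors the abstract's description of aligning entities across documents and aggregating evidence.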

Mitigated Attention Diffusion

DocSage's structured representation is reported to mitigate LLM attention diffusion, because the model reasons over relational tables rather than long unstructured text

Demerits

Complexity

DocSage's multi-module design may introduce complexity in implementation and maintenance

Scalability

The abstract does not report how accuracy or cost change as the number, size, and complexity of input documents grow, so scalability remains an open question

Expert Commentary

DocSage targets long-standing limitations of long-context LLMs and RAG systems in Multi-document Multi-entity Question Answering. Converting documents into query-specific relational tables and reasoning over them relationally is a distinctive design, and the reported accuracy improvements of more than 27% on two benchmarks make it a promising approach. However, the complexity of the multi-module pipeline and its unreported scaling behavior may pose challenges in real-world deployments. Further research is needed to validate the results beyond the two benchmarks and to address these limitations.

Recommendations

  • Future research should focus on simplifying DocSage's implementation and improving its scalability
  • The framework's potential applications in various domains should be explored, including customer service, education, and research
