Domain-Partitioned Hybrid RAG for Legal Reasoning: Toward Modular and Explainable Legal AI for India

arXiv:2602.23371v1. Abstract: Legal research in India involves navigating long and heterogeneous documents spanning statutes, constitutional provisions, penal codes, and judicial precedents, where purely keyword-based or embedding-only retrieval systems often fail to support structured legal reasoning. Recent retrieval-augmented generation (RAG) approaches improve grounding but struggle with multi-hop reasoning, citation chaining, and cross-domain dependencies inherent to legal texts. We propose a domain-partitioned hybrid RAG and Knowledge Graph architecture designed specifically for Indian legal research. The system integrates three specialized RAG pipelines covering Supreme Court case law, statutory and constitutional texts, and the Indian Penal Code, each optimized for domain-specific retrieval. To enable relational reasoning beyond semantic similarity, we construct a Neo4j-based Legal Knowledge Graph capturing structured relationships among cases, statutes, IPC sections, judges, and citations. An LLM-driven agentic orchestrator dynamically routes queries across retrieval modules and the knowledge graph, fusing evidence into grounded and citation-aware responses. We evaluate the system using a 40-question synthetic legal question-answer benchmark curated from authoritative Indian legal sources and assessed via an LLM-as-a-Judge framework. Results show that the hybrid architecture achieves a 70 percent pass rate, substantially outperforming a RAG-only baseline at 37.5 percent, with marked improvements in completeness and legal reasoning quality. These findings demonstrate that combining domain-partitioned retrieval with structured relational knowledge provides a scalable and interpretable foundation for advanced legal AI systems in the Indian judicial context.

Executive Summary

This article proposes a domain-partitioned hybrid retrieval-augmented generation (RAG) architecture for Indian legal research, addressing the limits of keyword-based and embedding-only retrieval, which often fail to support structured legal reasoning. The system integrates three specialized RAG pipelines for Supreme Court case law, statutory and constitutional texts, and the Indian Penal Code (IPC). A Neo4j-based Legal Knowledge Graph captures structured relationships among cases, statutes, IPC sections, judges, and citations, and an LLM-driven orchestrator routes each query across the pipelines and the graph before fusing the evidence into a cited answer. On a 40-question synthetic benchmark scored with an LLM-as-a-Judge framework, the hybrid system achieves a 70% pass rate against 37.5% for a RAG-only baseline. The authors take this as evidence that combining domain-partitioned retrieval with structured relational knowledge is a promising foundation for legal AI systems in the Indian judicial context.

Key Points

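  • Domain-partitioned hybrid RAG architecture for Indian legal research
  • Three specialized RAG pipelines: Supreme Court case law, statutory and constitutional texts, and the Indian Penal Code
  • Neo4j-based Legal Knowledge Graph capturing relationships among cases, statutes, IPC sections, judges, and citations
  • LLM-driven orchestrator that routes queries across the pipelines and the graph and fuses the evidence
  • 70% pass rate versus 37.5% for a RAG-only baseline on a 40-question synthetic benchmark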
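
To make the routing idea concrete, here is a minimal Python sketch of a dispatcher that sends a query to one or more domain modules and collects their evidence. It is not the authors' implementation: the paper describes an LLM-driven orchestrator, whereas this sketch substitutes simple keyword triggers, and every name in it (`Evidence`, `make_stub`, `MODULES`, `TRIGGERS`, `route`, `gather`) is hypothetical. The retrievers are stubs that only echo the query.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Evidence:
    source: str  # name of the module that produced this item
    text: str    # retrieved passage or graph fact


def make_stub(name: str) -> Callable[[str], List[Evidence]]:
    """Build a placeholder retriever; a real one would query an index or a graph."""
    def retrieve(query: str) -> List[Evidence]:
        return [Evidence(name, f"[{name} result for: {query}]")]
    return retrieve


# Hypothetical registry: three domain pipelines plus the knowledge graph.
MODULES: Dict[str, Callable[[str], List[Evidence]]] = {
    name: make_stub(name) for name in ("case_law", "statutes", "ipc", "graph")
}

# Keyword triggers are an assumption standing in for the LLM's routing decision.
TRIGGERS: Dict[str, Tuple[str, ...]] = {
    "case_law": ("judgment", "precedent", "supreme court"),
    "statutes": ("statute", "constitution", "article"),
    "ipc": ("ipc", "penal code", "offence", "punishment"),
    "graph": ("cited", "cites", "overruled", "citation"),
}


def route(query: str) -> List[str]:
    """Return the modules whose triggers appear in the query."""
    q = query.lower()
    chosen = [name for name, words in TRIGGERS.items()
              if any(word in q for word in words)]
    # With no match, fall back to all three text pipelines.
    return chosen or ["case_law", "statutes", "ipc"]


def gather(query: str) -> List[Evidence]:
    """Call every routed module and concatenate the evidence, tagged by source."""
    evidence: List[Evidence] = []
    for name in route(query):
        evidence.extend(MODULES[name](query))
    return evidence
```

Tagging each item with its source module is one simple way to keep the final answer citation-aware: a downstream generator can state which pipeline or graph lookup supports each claim.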

Merits

Strength in Addressing Complexity

The proposed architecture tackles the complexity of Indian legal research by pairing multiple domain-specific retrieval pipelines with a structured knowledge graph. The abstract reports marked improvements in completeness and legal reasoning quality over a RAG-only baseline.
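
The kind of relational, multi-hop lookup that a knowledge graph adds over similarity search can be illustrated with a small in-memory graph. The Python sketch below is an illustration only: the paper uses Neo4j, and its actual schema is not given in the abstract, so the node names ("Case A", "IPC Section N", "Judge X") and relation labels (`CITES`, `INTERPRETS`, `DECIDED_BY`) are placeholders, and `find_path` is a hypothetical helper.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

Hop = Tuple[str, str, str]  # (source node, relation, target node)

# Toy graph: node -> list of (relation, neighbour). All names are placeholders.
GRAPH: Dict[str, List[Tuple[str, str]]] = {
    "Case A": [("CITES", "Case B"), ("DECIDED_BY", "Judge X")],
    "Case B": [("INTERPRETS", "IPC Section N")],
    "IPC Section N": [],
    "Judge X": [],
}


def find_path(graph: Dict[str, List[Tuple[str, str]]],
              start: str, goal: str, max_hops: int = 3) -> Optional[List[Hop]]:
    """Breadth-first search for a shortest chain of relations from start to goal."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        if len(path) >= max_hops:
            continue  # do not expand beyond the hop limit
        for relation, neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, path + [(node, relation, neighbour)]))
    return None  # no chain within max_hops


chain = find_path(GRAPH, "Case A", "IPC Section N")
for source, relation, target in chain or []:
    print(f"{source} -[{relation}]-> {target}")
```

The returned chain is itself an explanation: each hop names the relation that links two legal entities, which is the sort of trace a purely embedding-based retriever does not provide.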

Scalability and Interpretability

Because each domain has its own pipeline, a module can be updated or extended without rebuilding the others, and the knowledge graph exposes explicit relationships that a reader can inspect. The abstract presents this as a scalable and interpretable foundation for advanced legal AI systems.

Demerits

Limited Generalizability

The evaluation uses a 40-question synthetic benchmark curated from Indian legal sources and scored by an LLM-as-a-Judge framework. Results on such a small, synthetic set may not carry over to real-world queries, other jurisdictions, or other legal domains.
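
A quick calculation shows how small the benchmark is in absolute terms. The Python sketch below converts the reported pass rates into question counts, assuming each of the 40 questions is scored pass or fail and both systems answered all 40 (the abstract does not spell this out). The helper `passes` is hypothetical.

```python
from fractions import Fraction

QUESTIONS = 40  # benchmark size reported in the abstract


def passes(rate_percent: str, total: int = QUESTIONS) -> Fraction:
    """Convert a pass rate in percent into a number of passed questions."""
    return Fraction(rate_percent) / 100 * total


hybrid = passes("70")      # hybrid architecture
baseline = passes("37.5")  # RAG-only baseline
step = Fraction(100, QUESTIONS)  # percentage points contributed by one question

print(f"hybrid: {hybrid} of {QUESTIONS}")
print(f"baseline: {baseline} of {QUESTIONS}")
print(f"gap: {hybrid - baseline} questions")
print(f"one question = {float(step)} percentage points")
```

Under that assumption, the two rates correspond to 28 and 15 passed questions, a gap of 13, and each question moves the pass rate by 2.5 percentage points.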

Dependence on LLMs

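The system relies on Large Language Models (LLMs) for query routing and evidence fusion, so answer quality depends on how the underlying model behaves at both steps. The evaluation also relies on an LLM-as-a-Judge framework, which means the reported scores depend on a model as well.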

Expert Commentary

This article makes a useful contribution to legal AI for the Indian judicial context. The modular design and the knowledge graph give it a plausible path to scale and to explain its answers, which makes it attractive for real-world use. However, the small synthetic benchmark and the dependence on LLMs for routing, fusion, and evaluation are notable limitations that future research should address. The article also underlines the importance of explainable AI in legal settings, an issue that needs continued attention.

Recommendations

  • Further evaluation of the system's performance on real-world legal datasets and in different jurisdictions
  • Investigation of methods to improve the system's generalizability and reduce dependence on LLMs
