Panini: Continual Learning in Token Space via Structured Memory
arXiv:2602.15156v1 Announce Type: new Abstract: Language models are increasingly used to reason over content they were not trained on, such as new documents, evolving knowledge, and user-specific data. A common approach is retrieval-augmented generation (RAG), which stores verbatim documents externally (as chunks) and retrieves only a relevant subset at inference time for an LLM to reason over. However, this results in inefficient usage of test-time compute (LLM repeatedly reasons over the same documents); moreover, chunk retrieval can inject irrelevant context that increases unsupported generation. We propose a human-like non-parametric continual learning framework, where the base model remains fixed, and learning occurs by integrating each new experience into an external semantic memory state that accumulates and consolidates itself continually. We present Panini, which realizes this by representing documents as Generative Semantic Workspaces (GSW) -- an entity- and event-aware network of question-answer (QA) pairs, sufficient for an LLM to reconstruct the experienced situations and mine latent knowledge via reasoning-grounded inference chains on the network. Given a query, Panini only traverses the continually-updated GSW (not the verbatim documents or chunks), and retrieves the most likely inference chains. Across six QA benchmarks, Panini achieves the highest average performance, 5%-7% higher than other competitive baselines, while using 2-30x fewer answer-context tokens, supports fully open-source pipelines, and reduces unsupported answers on curated unanswerable queries. The results show that efficient and accurate structuring of experiences at write time -- as achieved by the GSW framework -- yields both efficiency and reliability gains at read time. Code is available at https://github.com/roychowdhuryresearch/gsw-memory.
Executive Summary
Panini is a continual learning framework that takes a human-like, non-parametric approach: the base model stays fixed, and each new experience is integrated into an external semantic memory state that accumulates and consolidates over time. Documents are represented as Generative Semantic Workspaces (GSW), entity- and event-aware networks of question-answer pairs that an LLM can traverse to reconstruct experienced situations. Across six QA benchmarks, Panini achieves the highest average performance (5%-7% above competitive baselines) while using 2-30x fewer answer-context tokens and reducing unsupported answers on curated unanswerable queries. The results demonstrate that structuring experiences at write time yields both efficiency and reliability gains at read time, with significant implications for real-world applications where knowledge constantly evolves.
Key Points
- ▸ Panini introduces a non-parametric continual learning framework for integrating new experiences into an external semantic memory state.
- ▸ Generative Semantic Workspaces (GSW) represent documents as entity- and event-aware networks of question-answer pairs.
- ▸ Panini achieves superior performance on six QA benchmarks, outperforming competitive baselines and using fewer tokens.
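The key points above describe GSW as an entity- and event-aware network of QA pairs built at write time. A minimal sketch of such a memory (all class and method names here, such as `QAPair` and `Workspace`, are illustrative assumptions, not the paper's actual API) might look like:

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class QAPair:
    """One node of the workspace: a question-answer pair about some entities."""
    question: str
    answer: str
    entities: frozenset  # entities/events this pair mentions

@dataclass
class Workspace:
    pairs: list = field(default_factory=list)
    index: dict = field(default_factory=dict)  # entity -> list of pair ids

    def write(self, pair: QAPair) -> int:
        """Consolidate a new experience into the memory state."""
        pid = len(self.pairs)
        self.pairs.append(pair)
        for ent in pair.entities:
            self.index.setdefault(ent, []).append(pid)
        return pid

    def neighbors(self, pid: int) -> set:
        """Pair ids that share at least one entity with `pid`."""
        out = set()
        for ent in self.pairs[pid].entities:
            out.update(self.index[ent])
        out.discard(pid)
        return out

gsw = Workspace()
gsw.write(QAPair("Who founded Acme?", "Ada Lovelace", frozenset({"Acme", "Ada"})))
gsw.write(QAPair("Where is Acme based?", "London", frozenset({"Acme"})))
print(gsw.neighbors(0))  # {1}: linked via the shared entity "Acme"
```

Because writes maintain an entity index incrementally, the memory can grow document by document without touching the base model, which is the essence of the non-parametric continual learning claim.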
Merits
Strength in Continual Learning
Panini's non-parametric approach keeps the base model fixed and adapts to new experiences by updating the external memory rather than retraining, which suits real-world settings where knowledge evolves continually.
Improved Efficiency
By answering from the continually updated GSW rather than retrieving verbatim documents or chunks, Panini avoids repeatedly reasoning over the same text at test time and uses 2-30x fewer answer-context tokens than competing baselines.
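At read time, the abstract says Panini traverses the GSW and retrieves the most likely inference chains. The sketch below (an illustrative assumption, not the paper's algorithm) shows the idea with a breadth-first search over QA-pair nodes linked by shared entities, returning a short chain of pair ids instead of whole document chunks:

```python
from collections import deque

def find_chain(adj, start, goal, max_hops=3):
    """Shortest chain of QA-pair ids from `start` to `goal` over entity links.

    `adj` maps a pair id to the ids it shares an entity with.
    Returns None when no chain exists within the hop budget.
    """
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        if len(path) > max_hops:
            continue
        for nxt in adj.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

# Toy entity-link graph over four QA pairs.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(find_chain(adj, 0, 3))  # [0, 1, 2, 3]
```

Retrieving a handful of compact QA nodes along such a chain, rather than full chunks, is one plausible mechanism behind the reported token savings.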
Enhanced Reliability
Panini's structuring of experiences at write time enables the model to reason more accurately and reliably over new documents, reducing the occurrence of unsupported answers.
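One way a structured memory can reduce unsupported answers, in the spirit of the paper's results on curated unanswerable queries, is to abstain whenever no supporting evidence exists in memory. This is a hypothetical sketch (the `answer` function and its index format are assumptions, not the paper's method):

```python
def answer(query_entities, index):
    """Return supporting QA-pair ids, or None to abstain.

    `index` maps entity -> list of pair ids, as built at write time.
    An empty hit set means the memory cannot support any answer.
    """
    hits = set()
    for ent in query_entities:
        hits.update(index.get(ent, []))
    return sorted(hits) if hits else None

index = {"Acme": [0, 1], "Ada": [0]}
print(answer({"Acme"}, index))    # [0, 1]
print(answer({"Globex"}, index))  # None -> abstain instead of guessing
```

In a chunk-based RAG pipeline, a near-miss chunk can still be retrieved and invite an unsupported generation; an explicit evidence check like this makes "no supporting chain" a first-class outcome.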
Demerits
Limited Domain Adaptation
Panini's GSW framework may transfer poorly to domains with substantially different terminology, concepts, or data distributions, limiting its out-of-domain applicability.
Scalability Concerns
As the size of the GSW network grows, Panini's performance may degrade due to increased computational complexity and memory requirements, potentially limiting its scalability.
Expert Commentary
Panini is a groundbreaking contribution to the field of natural language processing, offering a novel approach to continual learning and knowledge representation. While there are limitations to its adaptability and scalability, the framework's performance and efficiency gains make it a compelling solution for real-world applications. Future research should investigate the connections between Panini and related areas, such as knowledge graph embeddings and reasoning-grounded inference chains, to further advance our understanding of knowledge representation and retrieval.
Recommendations
- ✓ Further investigation into the scalability concerns of Panini's GSW framework is necessary to ensure its applicability to large-scale applications.
- ✓ Exploring the application of Panini's reasoning-grounded inference chains to other settings, such as vision-language or multi-step reasoning tasks, could lead to significant breakthroughs in those areas.