Compiled Memory: Not More Information, but More Precise Instructions for Language Agents
arXiv:2603.15666v1 Announce Type: new Abstract: Existing memory systems for language agents address memory management: how to retrieve and page more information within a context budget. We address a complementary problem -- memory utility: what experience is worth keeping, and how it should change agent behavior. We present Atlas, a memory kernel that compiles accumulated task experience into an agent's instruction structure -- without fine-tuning, RAG, or human intervention. Memory is distillation, not storage; delivery is instruction rewriting, not context injection. Facts extracted from agent failures and successes are verified through a three-step promotion gate and delivered by rewriting the agent's system prompt with learned sub-bullets. On CUAD contract analysis, the evolved prompt improves GPT-4o token-level F1 by $+8.7$pp and precision by $+12.5$pp. On HotpotQA multi-hop QA, joint F1 improves $+3.16$pp. An ablation isolates the mechanism's defining property -- the training signal constraint: the evolved prompt learns exactly what it is taught, and nothing more. Applied to Claude Sonnet~4.5 using the same evolved prompt -- compiled from GPT-4o errors, unchanged -- joint F1 improves $+2.31$pp, with gains concentrating where Claude's stronger baseline leaves the most room -- confirming that the compiled knowledge is task-shaped, not model-shaped.
Executive Summary
The article presents Atlas, a memory kernel that compiles accumulated task experience into an agent's instruction structure without fine-tuning, retrieval augmentation, or human intervention. On CUAD contract analysis, the evolved prompt improves GPT-4o token-level F1 by $+8.7$pp and precision by $+12.5$pp; on HotpotQA multi-hop QA, joint F1 improves by $+3.16$pp. The compiled knowledge is task-shaped rather than model-shaped: the same evolved prompt, compiled from GPT-4o errors and applied unchanged, improves Claude Sonnet 4.5's joint F1 by $+2.31$pp. Atlas thus addresses memory utility (which experience is worth keeping, and how it should change agent behavior) as a complement to memory management, and is most promising for tasks that reward precise instructions.
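To make the described pipeline concrete, here is a minimal sketch of the compile-experience-into-instructions idea: candidate facts distilled from task traces pass a three-step promotion gate before being delivered as sub-bullets appended to the system prompt. All names, the gate criteria, and the `Fact` schema are illustrative assumptions, not the paper's actual implementation.

```python
from dataclasses import dataclass

@dataclass
class Fact:
    """A candidate lesson distilled from a task trace (hypothetical schema)."""
    text: str
    support: int = 1       # how many traces exhibited this lesson
    verified: bool = False # passed a held-out verification check

def promotion_gate(fact: Fact, min_support: int = 2) -> bool:
    """Toy three-step gate (steps are illustrative, not from the paper):
    1) filter trivial facts, 2) require recurrence, 3) require verification."""
    if len(fact.text.split()) < 4:   # step 1: too short to be actionable
        return False
    if fact.support < min_support:   # step 2: must recur across traces
        return False
    return fact.verified             # step 3: held-out verification

def compile_prompt(base_prompt: str, facts: list[Fact]) -> str:
    """Deliver promoted facts by rewriting the system prompt with sub-bullets,
    rather than injecting retrieved context at query time."""
    promoted = [f for f in facts if promotion_gate(f)]
    bullets = "\n".join(f"  - {f.text}" for f in promoted)
    return base_prompt + ("\nLearned guidance:\n" + bullets if bullets else "")

base = "You extract clauses from contracts."
facts = [
    Fact("Quote clause spans verbatim; do not paraphrase.", support=3, verified=True),
    Fact("Be brief.", support=5, verified=True),  # filtered at step 1 as trivial
]
print(compile_prompt(base, facts))
```

The sketch illustrates the "training signal constraint" the abstract highlights: only facts that survive the gate alter the prompt, so the evolved prompt learns exactly what it is taught and nothing more.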
Key Points
- ▸ Atlas, a memory kernel, compiles accumulated task experience into an agent's instruction structure
- ▸ Atlas improves language agent performance without human intervention or fine-tuning
- ▸ The proposed mechanism is task-shaped, not model-shaped, demonstrating its effectiveness on different language models
Merits
Strength in Task-Shaped Learning
The Atlas mechanism learns from task-specific experiences and adapts to the particular requirements of each task, leading to improved performance.
Efficient Memory Use
Atlas treats memory as distillation rather than storage: accumulated experience is compiled into the agent's instruction structure instead of retained as raw traces, avoiding large memory stores and keeping delivery within the existing prompt.
Demerits
Limited Generalizability
The effectiveness of Atlas is demonstrated on specific tasks and language models, and its generalizability to other domains and tasks remains to be explored.
Dependence on High-Quality Training Data
The quality and accuracy of the training data used to compile the agent's instruction structure are crucial for the success of Atlas, and poor data quality may lead to suboptimal performance.
Expert Commentary
The article presents a novel and promising approach: rather than retrieving more context, Atlas distills experience into more precise instructions, framing memory as a utility problem (which experience matters, and how it should change agent behavior) that complements existing memory-management work. The evidence, however, covers two tasks and two model families, so generalizability to other domains remains open. The approach also depends on accurate fact extraction from agent traces; a noisy training signal could propagate errors into the evolved prompt. Nonetheless, for tasks that reward precise instructions, the demonstrated gains are substantial.
Recommendations
- ✓ Further research is needed to explore the generalizability of Atlas to other domains and tasks
- ✓ High-quality training data is essential for the success of Atlas, and methods for ensuring data accuracy and reliability should be developed