Retrieval-Augmented LLM Agents: Learning to Learn from Experience

arXiv:2603.18272v1 Announce Type: new Abstract: While large language models (LLMs) have advanced the development of general-purpose agents, achieving robust generalization to unseen tasks remains a significant challenge. Current approaches typically rely on either fine-tuning or training-free memory-augmented generation using retrieved experience; yet both have limitations: fine-tuning often fails to extrapolate to new tasks, while experience retrieval often underperforms compared to supervised baselines. In this work, we propose to combine these approaches and systematically study how to train retrieval-augmented LLM agents to effectively leverage retrieved trajectories in-context. First, we establish a robust supervised fine-tuning (SFT) recipe using LoRA that outperforms several state-of-the-art agent training pipelines. Second, we provide a detailed analysis of key design choices for experience retrieval, identifying optimal strategies for storage, querying, and trajectory selection. Finally, we propose a pipeline that integrates experience retrieval into the fine-tuning process. Our results demonstrate that this combined approach significantly improves generalization to unseen tasks, providing a scalable and effective framework for building agents that learn to learn from experience.
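To make the core idea concrete, here is a minimal sketch of how retrieved trajectories might be placed in-context for a retrieval-augmented agent. The paper does not specify its prompt format; the field names (`task`, `actions`, `outcome`) and the prompt wording below are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of in-context experience augmentation.
# Field names and prompt wording are assumptions, not the paper's format.

def format_trajectory(traj: dict) -> str:
    """Render one stored trajectory as an in-context exemplar."""
    steps = "\n".join(f"  {i + 1}. {a}" for i, a in enumerate(traj["actions"]))
    return f"Task: {traj['task']}\nActions:\n{steps}\nOutcome: {traj['outcome']}"

def build_prompt(query_task: str, retrieved: list) -> str:
    """Prepend retrieved trajectories to the new task, so the model
    can condition on prior experience (during SFT and at inference)."""
    exemplars = "\n\n".join(format_trajectory(t) for t in retrieved)
    return (
        "You may consult these past trajectories:\n\n"
        f"{exemplars}\n\n"
        f"New task: {query_task}\nActions:"
    )

example = {
    "task": "open the drawer",
    "actions": ["go to drawer", "pull handle"],
    "outcome": "success",
}
print(build_prompt("open the cabinet", [example]))
```

During the proposed fine-tuning pipeline, training examples would be built in roughly this shape, so the model learns to exploit the retrieved exemplars rather than ignore them.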

Executive Summary

This article proposes an approach to training retrieval-augmented large language model (LLM) agents to learn effectively from experience. By combining supervised fine-tuning (SFT) with experience retrieval, the authors report significant improvements in generalization to unseen tasks. The approach has three parts: establishing a robust LoRA-based SFT recipe that outperforms several state-of-the-art agent training pipelines, analyzing key design choices for experience retrieval (storage, querying, and trajectory selection), and integrating experience retrieval into the fine-tuning process itself. The results suggest a scalable and effective framework for building agents that learn to learn from experience, with the potential to yield more robust and adaptive agents.

Key Points

  • Combining supervised fine-tuning and experience retrieval improves generalization to unseen tasks
  • Establishing a robust supervised fine-tuning recipe using LoRA is crucial for success
  • Optimal strategies for storage, querying, and trajectory selection are essential for experience retrieval
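The storage/querying/selection loop in the last point can be sketched as a simple similarity-based experience store. The bag-of-words "embedding" below is a toy stand-in for whatever learned encoder the paper's retrieval component uses; class and method names are illustrative, not from the paper.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real system would use a learned encoder."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class ExperienceStore:
    """Stores (task embedding, trajectory) pairs; queried by task description."""

    def __init__(self):
        self.entries = []  # list of (embedding, trajectory)

    def add(self, task_description: str, trajectory: list):
        # Storage: index each trajectory under its task description.
        self.entries.append((embed(task_description), trajectory))

    def retrieve(self, query: str, k: int = 2) -> list:
        # Querying + selection: rank stored tasks by similarity, keep top-k.
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[0]), reverse=True)
        return [traj for _, traj in ranked[:k]]

store = ExperienceStore()
store.add("open the kitchen drawer", ["go to drawer", "pull handle"])
store.add("water the plant", ["get can", "pour water"])
print(store.retrieve("open the bedroom drawer", k=1))
# → [['go to drawer', 'pull handle']]
```

The paper's analysis concerns exactly these three knobs: what to store (the key and trajectory format), how to query (the similarity function and query text), and how to select among candidates (here, naive top-k).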

Merits

Robust Generalization

The combined approach significantly improves generalization to unseen tasks, addressing a key weakness of fine-tuning alone, which often fails to extrapolate beyond its training distribution.

Scalability

The framework is scalable: new experience can be added to the retrieval store without redesigning the training pipeline, supporting the development of more robust and adaptive agents.

Flexibility

The approach can be applied to various tasks and domains, making it a versatile solution.

Demerits

Complexity

The proposed approach involves multiple components, which may increase complexity and make it challenging to implement and maintain.

Resource Intensity

The approach requires significant computational resources and large amounts of data, which may be a limitation for some applications.

Dependence on Dataset Quality

The success of the approach depends heavily on the quality of the dataset used for experience retrieval, which may be a limitation in certain scenarios.

Expert Commentary

The proposed approach addresses the long-standing challenge of robust generalization to unseen tasks. Combining supervised fine-tuning with experience retrieval pairs the strengths of both: fine-tuning provides strong baseline competence, while retrieved trajectories supply task-specific context at inference time. The main caveats are the added complexity and resource cost of maintaining both a training pipeline and an experience store, and the method's sensitivity to the quality of the stored trajectories. On balance, the potential benefits make it a promising development in the field.

Recommendations

  • Further research is needed to explore the application of the proposed approach to various tasks and domains.
  • Explainability and transparency techniques should be investigated to better understand how retrieved trajectories influence the agent's decision-making.