SLEA-RL: Step-Level Experience Augmented Reinforcement Learning for Multi-Turn Agentic Training
arXiv:2603.18079v1. Abstract: Large Language Model (LLM) agents have shown strong results on multi-turn tool-use tasks, yet they operate in isolation during training, failing to leverage experiences accumulated across episodes. Existing experience-augmented methods address this by organizing trajectories into retrievable libraries, but they retrieve experiences only once based on the initial task description and hold them constant throughout the episode. In multi-turn settings where observations change at every step, this static retrieval becomes increasingly mismatched as episodes progress. We propose SLEA-RL (Step-Level Experience-Augmented Reinforcement Learning), a framework that retrieves relevant experiences at each decision step conditioned on the current observation. SLEA-RL operates through three components: (i) step-level observation clustering that groups structurally equivalent environmental states for efficient cluster-indexed retrieval; (ii) a self-evolving experience library that distills successful strategies and failure patterns through score-based admission and rate-limited extraction; and (iii) policy optimization with step-level credit assignment for fine-grained advantage estimation across multi-turn episodes. The experience library evolves alongside the policy through semantic analysis rather than gradient updates. Experiments on long-horizon multi-turn agent benchmarks demonstrate that SLEA-RL achieves superior performance compared to various reinforcement learning baselines.
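To make component (i) concrete, the abstract's cluster-indexed, step-level retrieval can be sketched roughly as follows. This is an illustrative assumption, not the paper's implementation: here a cluster is keyed by a simple structural signature (the sorted set of observation fields), so observations with the same structure but different values land in the same cluster, and the agent retrieves hints from the cluster matching its *current* observation at every step rather than once per episode.

```python
# Hypothetical sketch of cluster-indexed, step-level experience retrieval.
# The signature function and ExperienceLibrary class are illustrative
# stand-ins, not SLEA-RL's actual components.
from collections import defaultdict

def structural_signature(observation: dict) -> tuple:
    """Map an observation to a structure key: its sorted field names."""
    return tuple(sorted(observation.keys()))

class ExperienceLibrary:
    def __init__(self):
        # signature -> list of distilled experience strings
        self.clusters = defaultdict(list)

    def add(self, observation: dict, experience: str) -> None:
        self.clusters[structural_signature(observation)].append(experience)

    def retrieve(self, observation: dict, k: int = 2) -> list:
        """Return up to k experiences from the matching cluster."""
        return self.clusters[structural_signature(observation)][:k]

lib = ExperienceLibrary()
lib.add({"page": "login", "error": None}, "fill credentials before submitting")
lib.add({"page": "login", "error": "bad password"}, "reset password via email link")
lib.add({"results": []}, "broaden the search query")

# Step-level retrieval: the hints change as the observation changes.
# Same field structure as the login states -> login-cluster hints.
step1 = lib.retrieve({"page": "search", "error": None})
# Different structure -> a different cluster's hints.
step2 = lib.retrieve({"results": ["doc_1"]})
```

Note the design choice this illustrates: because retrieval is keyed on the observation's structure rather than the initial task description, the retrieved experiences stay aligned with the environment state as the episode progresses.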
Executive Summary
This article covers SLEA-RL, a framework for experience-augmented reinforcement learning that addresses a key limitation of existing methods in multi-turn settings: retrieving experiences once, from the initial task description, and holding them fixed for the whole episode. By instead retrieving relevant experiences at each decision step conditioned on the current observation, SLEA-RL improves performance on long-horizon multi-turn agent benchmarks. The framework consists of three components: step-level observation clustering, a self-evolving experience library, and policy optimization with step-level credit assignment. Experiments show gains over various reinforcement learning baselines, with implications for training more efficient and effective multi-turn LLM agents.
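The third component, step-level credit assignment, can be sketched as per-step advantage estimation over a multi-turn episode. The formulation below (discounted return-to-go minus a per-step baseline) is a standard illustrative choice, not necessarily the paper's exact estimator; the point is that each step receives its own advantage instead of one trajectory-level advantage shared by all steps.

```python
# Hedged sketch of step-level advantage estimation for a multi-turn episode.
# Each step's advantage is its discounted return-to-go minus that step's
# baseline, giving fine-grained per-step credit (illustrative formulation).
def step_level_advantages(rewards, baselines, gamma=0.99):
    advantages = [0.0] * len(rewards)
    ret = 0.0
    # Walk the episode backwards, accumulating the discounted return-to-go.
    for t in reversed(range(len(rewards))):
        ret = rewards[t] + gamma * ret
        advantages[t] = ret - baselines[t]
    return advantages
```

With an undiscounted episode that only rewards the final step, `step_level_advantages([0.0, 0.0, 1.0], [0.5, 0.5, 0.5], gamma=1.0)` assigns every step the same positive advantage, while a discount below 1.0 concentrates credit on the steps nearest the reward.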
Key Points
- ▸ SLEA-RL proposes a novel framework for experience-augmented reinforcement learning in multi-turn settings.
- ▸ The framework retrieves relevant experiences at each decision step conditioned on the current observation.
- ▸ SLEA-RL consists of three components: step-level observation clustering, a self-evolving experience library, and policy optimization with step-level credit assignment.
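The second component's self-evolution loop can be sketched as follows. The thresholds, budget, and function names below are assumptions for illustration, not the paper's values: score-based admission keeps only clearly successful trajectories (distilled as strategies) or clearly failed ones (distilled as failure patterns), and a per-update extraction budget rate-limits how many new entries enter the library.

```python
# Illustrative sketch of score-based admission and rate-limited extraction.
# Thresholds and the budget are hypothetical parameters, not SLEA-RL's.
def admit(score: float, success_thr: float = 0.8, failure_thr: float = 0.2):
    """Classify a trajectory for admission; ambiguous scores are dropped."""
    if score >= success_thr:
        return "strategy"         # distill as a successful strategy
    if score <= failure_thr:
        return "failure_pattern"  # distill as a failure to avoid
    return None

def evolve_library(library, scored_trajectories, budget=2):
    """Add at most `budget` distilled entries per update (rate limiting).

    Trajectories are processed most-decisive-first (score farthest from 0.5),
    so the budget is spent on the clearest successes and failures.
    """
    extracted = 0
    ranked = sorted(scored_trajectories, key=lambda t: -abs(t[0] - 0.5))
    for score, summary in ranked:
        if extracted >= budget:
            break
        kind = admit(score)
        if kind is not None:
            library.append((kind, summary))
            extracted += 1
    return library

trajs = [(0.95, "A"), (0.5, "B"), (0.1, "C"), (0.85, "D")]
updated = evolve_library([], trajs, budget=2)
```

In this run the budget admits the clearest success (score 0.95) and the clearest failure (score 0.1), while the ambiguous trajectory (0.5) is rejected outright; the abstract's note that the library evolves "through semantic analysis rather than gradient updates" corresponds here to appending distilled text entries instead of changing model weights.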
Merits
Strength
The framework directly targets a concrete limitation of existing experience-augmented methods: static, episode-level retrieval that grows increasingly mismatched as multi-turn episodes progress. The reported experiments show gains over various reinforcement learning baselines on long-horizon multi-turn agent benchmarks.
Innovative Approach
The use of step-level observation clustering and a self-evolving experience library is an innovative approach to experience-augmented reinforcement learning.
Efficient and Effective
Because the experience library evolves alongside the policy through semantic analysis rather than gradient updates, the augmentation stays decoupled from backpropagation, which supports the development of more efficient and effective multi-turn AI agents.
Demerits
Limitation
Retrieving and clustering experiences at every decision step, rather than once per episode, adds per-step overhead during training, which could be a barrier for practitioners with limited computational resources.
Complexity
Coordinating three interacting components (observation clustering, library evolution, and step-level credit assignment) introduces implementation and maintenance complexity compared to single-mechanism baselines.
Evaluation
The evaluation is limited to a small set of multi-turn agent benchmarks, so it remains unclear how well the framework would generalize to more diverse and complex environments.
Expert Commentary
SLEA-RL is a meaningful contribution to experience-augmented reinforcement learning for multi-turn LLM agents. Step-level observation clustering combined with a self-evolving experience library is an innovative approach with the potential to improve agent performance in complex, dynamic environments where static episode-level retrieval falls short. That said, the framework may demand significant computational resources to implement and train, and its evaluation so far covers a limited set of experiments. The direction nonetheless warrants further investigation, particularly around how the non-parametric library and the gradient-trained policy co-evolve.
Recommendations
- ✓ Further research is needed to fully evaluate the proposed framework and its potential applications.
- ✓ Practitioners building multi-turn tool-use agents should consider step-level, observation-conditioned retrieval as an alternative to static episode-level experience libraries.