
Learning to Rewrite Tool Descriptions for Reliable LLM-Agent Tool Use


Ruocheng Guo, Kaiwen Dong, Xiang Gao, Kamalika Das

arXiv:2602.20426v1 Announce Type: new Abstract: The performance of LLM-based agents depends not only on the agent itself but also on the quality of the tool interfaces it consumes. While prior work has focused heavily on agent fine-tuning, tool interfaces (including natural language descriptions and parameter schemas) remain largely human-oriented and often become a bottleneck, especially when agents must select from large candidate tool sets. Existing approaches to improving tool interfaces rely on execution traces, which are frequently unavailable in cold-start or privacy-constrained settings, and typically optimize each tool independently, limiting scalability and generalization to unseen tools. We propose Trace-Free+, a curriculum learning framework that progressively transfers supervision from trace-rich settings to trace-free deployment, encouraging the model to abstract reusable interface-usage patterns and tool usage outcomes. To support this approach, we construct a large-scale dataset of high-quality tool interfaces using a structured workflow over a diverse collection of tools. Experiments on StableToolBench and RestBench show consistent gains on unseen tools, strong cross-domain generalization, and robustness as the number of candidate tools scales to over 100, demonstrating that tool interface optimization is a practical and deployable complement to agent fine-tuning.

Executive Summary

The paper proposes Trace-Free+, a curriculum learning framework that optimizes tool interfaces for Large Language Model (LLM)-based agents. Trained on a large-scale dataset of high-quality tool interfaces built with a structured workflow, the model learns to abstract reusable interface-usage patterns and tool usage outcomes, so it can improve interfaces even for tools with no execution traces. On the StableToolBench and RestBench benchmarks, Trace-Free+ achieves consistent gains on unseen tools, strong cross-domain generalization, and robustness as the candidate tool set grows past 100 tools, positioning tool interface optimization as a practical, deployable complement to agent fine-tuning.
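To make concrete what "optimizing a tool interface" means here, the sketch below shows a hypothetical before/after rewrite of a single tool's description and parameter schema (the tool name and payload are invented for illustration; the paper does not publish this example). The callable surface stays the same; only the natural language the agent reads changes.

```python
# Hypothetical illustration of a tool-interface rewrite: same tool,
# same parameters, but the description and schema become agent-oriented.

original_interface = {
    "name": "get_wx",
    "description": "weather api",  # terse, human-oriented
    "parameters": {
        "loc": {"type": "string"},
        "u": {"type": "string"},
    },
}

rewritten_interface = {
    "name": "get_wx",
    "description": (
        "Return the current weather for a city. Use this tool when the "
        "user asks about present conditions, not forecasts."
    ),
    "parameters": {
        "loc": {
            "type": "string",
            "description": "City name, e.g. 'Paris'.",
        },
        "u": {
            "type": "string",
            "description": "Temperature unit to report in.",
            "enum": ["celsius", "fahrenheit"],
        },
    },
}

# The rewrite adds disambiguating usage guidance and per-parameter
# documentation while keeping the callable surface identical.
assert original_interface["name"] == rewritten_interface["name"]
assert set(original_interface["parameters"]) == set(rewritten_interface["parameters"])
```

The key property, which the framework's trace-free setting relies on, is that such rewrites can be produced from the interface text alone, without observing any calls to the tool.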

Key Points

  • Trace-Free+ is a curriculum learning framework that optimizes tool interfaces for LLM-based agents.
  • The framework leverages a structured workflow and a large-scale dataset of high-quality tool interfaces.
  • Trace-Free+ achieves consistent gains on unseen tools, strong cross-domain generalization, and robustness as the number of candidate tools scales.
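The scaling point above matters because an agent typically chooses a tool from a prompt that concatenates every candidate's interface. The minimal sketch below (my own illustration, not the paper's implementation) shows such a prompt being assembled: with 100+ candidates, the descriptions are the agent's only selection signal, which is why their quality becomes the bottleneck.

```python
def build_tool_selection_prompt(query, tools):
    """Concatenate candidate tool interfaces into the context an LLM
    agent sees when choosing a tool. Description quality drives the
    choice, since the agent never executes the tools at this stage."""
    lines = [f"User request: {query}", "Available tools:"]
    for i, tool in enumerate(tools, start=1):
        params = ", ".join(tool["parameters"])
        lines.append(f"{i}. {tool['name']}({params}): {tool['description']}")
    lines.append("Reply with the number of the best tool.")
    return "\n".join(lines)

# Two hypothetical candidates; a real deployment may list 100+.
tools = [
    {"name": "get_weather", "parameters": ["city"],
     "description": "Current weather conditions for a city."},
    {"name": "get_forecast", "parameters": ["city", "days"],
     "description": "Multi-day weather forecast for a city."},
]
prompt = build_tool_selection_prompt("Will it rain in Oslo tomorrow?", tools)
```

Because prompt length and ambiguity both grow with the candidate count, sharper descriptions help the agent discriminate between near-duplicate tools (here, current conditions vs. forecast).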

Merits

Strength

The proposed framework addresses the limitation of existing approaches that rely on execution traces, which are frequently unavailable in cold-start or privacy-constrained settings.

Strength

Trace-Free+ enables the model to abstract reusable interface-usage patterns and tool usage outcomes, leading to improved performance and generalization.

Demerits

Limitation

The framework assumes the availability of a large-scale dataset of high-quality tool interfaces, which may not be feasible for all domains or applications.

Limitation

The evaluation of Trace-Free+ is limited to two benchmark datasets, and further experiments on a wider range of datasets are needed to fully validate its effectiveness.

Expert Commentary

The paper presents a well-structured, well-motivated approach to optimizing tool interfaces for LLM-based agents, and Trace-Free+ is a plausible complement to agent fine-tuning. That said, the evaluation covers only two benchmarks, so broader experiments are needed to confirm that the gains generalize, and the training pipeline presumes access to a large-scale dataset of high-quality tool interfaces, which may not be available in every domain. Within those limits, the work contributes a novel, trace-free angle on tool interface optimization and addresses real shortcomings of trace-dependent methods.

Recommendations

  • Further experiments on a wider range of datasets are needed to fully validate the effectiveness of Trace-Free+.
  • The development of methods to create high-quality tool interfaces without relying on a large-scale dataset is an important area of future research.
