Dynamic System Instructions and Tool Exposure for Efficient Agentic LLMs


Uria Franko

arXiv:2602.17046v1 Announce Type: new Abstract: Large Language Model (LLM) agents often run for many steps while re-ingesting long system instructions and large tool catalogs each turn. This increases cost, agent derailment probability, latency, and tool-selection errors. We propose Instruction-Tool Retrieval (ITR), a RAG variant that retrieves, per step, only the minimal system-prompt fragments and the smallest necessary subset of tools. ITR composes a dynamic runtime system prompt and exposes a narrowed toolset with confidence-gated fallbacks. Using a controlled benchmark with internally consistent numbers, ITR reduces per-step context tokens by 95%, improves correct tool routing by 32% relative, and cuts end-to-end episode cost by 70% versus a monolithic baseline. These savings enable agents to run 2-20x more loops within context limits. Savings compound with the number of agent steps, making ITR particularly valuable for long-running autonomous agents. We detail the method, evaluation protocol, ablations, and operational guidance for practical deployment.

Executive Summary

The paper proposes Instruction-Tool Retrieval (ITR), a method that optimizes Large Language Model (LLM) agents by retrieving, at each step, only the minimal system-prompt fragments and the smallest necessary subset of tools. On a controlled benchmark, ITR reduces per-step context tokens by 95%, improves correct tool routing by 32% relative, and cuts end-to-end episode cost by 70% versus a monolithic baseline, enabling agents to run 2-20x more loops within context limits. Because the savings recur at every step, ITR is particularly valuable for long-running autonomous agents.

Key Points

  • ITR retrieves minimal system-prompt fragments and necessary tools per step
  • Reduces per-step context tokens by 95% and end-to-end episode cost by 70%
  • Improves correct tool routing by 32% relative to a monolithic baseline
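The per-step retrieval loop behind these points can be sketched as follows. The helper names, the embedding format, and the `k`/`threshold` values are illustrative assumptions for this sketch, not the paper's implementation:

```python
# Sketch of per-step instruction/tool retrieval with a confidence-gated
# fallback (hypothetical names and parameters; the paper's ITR may differ).
import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query_vec, items, k, threshold):
    """Return the top-k items whose similarity clears the confidence gate."""
    scored = sorted(items, key=lambda it: cosine(query_vec, it["vec"]), reverse=True)
    kept = [it for it in scored[:k] if cosine(query_vec, it["vec"]) >= threshold]
    # Confidence-gated fallback: if nothing clears the gate, expose a
    # broader default set rather than an empty prompt or toolset.
    return kept if kept else scored[: k * 2]

def compose_step_context(step_query_vec, fragments, tools):
    """Build the dynamic system prompt and narrowed toolset for one step."""
    prompt_parts = retrieve(step_query_vec, fragments, k=3, threshold=0.3)
    toolset = retrieve(step_query_vec, tools, k=2, threshold=0.3)
    system_prompt = "\n".join(p["text"] for p in prompt_parts)
    return system_prompt, [t["name"] for t in toolset]
```

Each agent step would embed the current sub-task, call `compose_step_context`, and send only the returned prompt and toolset to the model, rather than the full instruction set and tool catalog.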

Merits

Efficient Resource Utilization

ITR reduces the context tokens ingested and the number of tools exposed at each step. Because a monolithic agent re-pays the full instruction and tool-catalog cost on every turn, these per-step savings recur throughout the episode and compound with the number of agent steps.
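A back-of-envelope illustration of how per-step savings scale with episode length (the token counts below are assumed for illustration, not taken from the paper's benchmark):

```python
# Illustrative cost arithmetic: a monolithic agent re-ingests the full
# system prompt and tool catalog every step, while an ITR-style agent
# ingests only the retrieved fragments. Token counts are assumptions.

def episode_tokens(per_step_context, steps):
    """Total context tokens ingested over an episode of `steps` agent loops."""
    return per_step_context * steps

monolithic_step = 20_000   # full instructions + full tool catalog (assumed)
itr_step = 1_000           # retrieved fragments + narrowed toolset: 95% smaller

steps = 50
mono_total = episode_tokens(monolithic_step, steps)   # 1,000,000 tokens
itr_total = episode_tokens(itr_step, steps)           # 50,000 tokens
savings = 1 - itr_total / mono_total                  # 0.95

# Loops that fit within a fixed context budget:
context_limit = 200_000
mono_loops = context_limit // monolithic_step         # 10 loops
itr_loops = context_limit // itr_step                 # 200 loops, i.e. 20x more
```

With these assumed numbers the narrowed context buys 20x more loops under the same budget, matching the upper end of the 2-20x range claimed in the abstract.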

Demerits

Complexity of Implementation

ITR adds retrieval infrastructure to the agent loop: indexing system-prompt fragments, embedding-based tool selection, and tuning the confidence-gated fallbacks. Integrating this machinery into existing agent frameworks and keeping the indexes in sync with the instruction set could be challenging in practice.

Expert Commentary

The proposed ITR method has significant implications for building efficient, scalable LLM agents. By shrinking the context and toolset presented at each step, ITR can let agents run longer without derailing or incurring prohibitive costs. However, the complexity of implementing ITR in practice should not be underestimated, and further research is needed to fully realize its potential. The evaluation protocol and ablations presented in the paper provide a solid foundation for understanding the benefits and limitations of ITR, and demonstrate its potential for improving the efficiency and effectiveness of LLM agents.

Recommendations

  • Further research is needed to explore the applications of ITR in various domains and to address the challenges of implementing ITR in practice.
  • Developers of LLM agents should consider incorporating ITR into their architectures to improve efficiency and reduce costs.
