Dynamic System Instructions and Tool Exposure for Efficient Agentic LLMs
arXiv:2602.17046v1 Announce Type: new Abstract: Large Language Model (LLM) agents often run for many steps while re-ingesting long system instructions and large tool catalogs each turn. This increases cost, agent derailment probability, latency, and tool-selection errors. We propose Instruction-Tool Retrieval (ITR), a RAG variant that retrieves, per step, only the minimal system-prompt fragments and the smallest necessary subset of tools. ITR composes a dynamic runtime system prompt and exposes a narrowed toolset with confidence-gated fallbacks. Using a controlled benchmark with internally consistent numbers, ITR reduces per-step context tokens by 95%, improves correct tool routing by 32% relative, and cuts end-to-end episode cost by 70% versus a monolithic baseline. These savings enable agents to run 2-20x more loops within context limits. Savings compound with the number of agent steps, making ITR particularly valuable for long-running autonomous agents. We detail the method, evaluation protocol, ablations, and operational guidance for practical deployment.
Executive Summary
The article proposes Instruction-Tool Retrieval (ITR), a method that makes Large Language Model (LLM) agents more efficient by retrieving, at each step, only the minimal system-prompt fragments and the smallest necessary subset of tools. ITR reduces per-step context tokens by 95% and end-to-end episode cost by 70%, enabling agents to run 2-20x more loops within context limits; because the savings compound with the number of steps, ITR is particularly valuable for long-running autonomous agents. On a controlled benchmark, ITR also improves correct tool routing by 32% relative to a monolithic baseline.
Key Points
- ▸ ITR retrieves minimal system-prompt fragments and necessary tools per step
- ▸ Reduces per-step context tokens by 95% and end-to-end episode cost by 70%
- ▸ Improves correct tool routing by 32% relative to a monolithic baseline
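The abstract does not include an implementation, but the per-step loop it describes (retrieve minimal prompt fragments, expose a narrowed toolset, fall back when retrieval confidence is low) can be sketched roughly as follows. All names here (`itr_step`, `Item`, the toy lexical scorer) are hypothetical stand-ins, not the paper's API; a real deployment would score relevance with embeddings rather than word overlap.

```python
from dataclasses import dataclass

@dataclass
class Item:
    """A retrievable unit: either a system-prompt fragment or a tool description."""
    name: str
    text: str

def score(query: str, text: str) -> float:
    # Toy lexical-overlap relevance score; stands in for embedding similarity.
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / max(len(q), 1)

def itr_step(query, fragments, tools, k_frag=2, k_tool=2, min_conf=0.2):
    """Compose a dynamic runtime system prompt from the top-k instruction
    fragments and expose only the top-k tools. If the best tool score falls
    below the confidence gate, fall back to exposing the full catalog."""
    top_frags = sorted(fragments, key=lambda f: score(query, f.text),
                       reverse=True)[:k_frag]
    ranked = sorted(tools, key=lambda t: score(query, t.text), reverse=True)
    best = score(query, ranked[0].text) if ranked else 0.0
    exposed = ranked[:k_tool] if best >= min_conf else tools  # confidence-gated fallback
    prompt = "\n".join(f.text for f in top_frags)
    return prompt, exposed
```

The confidence gate matters because a narrowed toolset turns a retrieval miss into a hard failure; falling back to the full catalog trades tokens for recall on low-confidence steps.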
Merits
Efficient Resource Utilization
ITR optimizes resource utilization by reducing the number of context tokens ingested and tools exposed per step, yielding significant cost and latency savings that compound over the length of an episode.
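Because the system prompt and tool catalog are re-ingested every turn, the fixed context cost scales linearly with the number of steps, so a per-step reduction carries through to the whole episode. The arithmetic can be illustrated with made-up numbers (the 20,000-token catalog below is an assumption for illustration, not a figure from the paper; only the 95% reduction is from the abstract):

```python
# Illustrative sizes: assume a 20,000-token monolithic system prompt +
# tool catalog, vs. an ITR-composed context after the reported 95%
# per-step reduction.
MONOLITHIC_TOKENS = 20_000
ITR_TOKENS = MONOLITHIC_TOKENS // 20  # 95% smaller -> 1,000 tokens

def episode_context_tokens(per_step: int, steps: int) -> int:
    # Fixed context is re-sent each turn, so cost grows linearly in steps.
    return per_step * steps

steps = 50
mono = episode_context_tokens(MONOLITHIC_TOKENS, steps)  # 1,000,000 tokens
itr = episode_context_tokens(ITR_TOKENS, steps)          # 50,000 tokens
print(f"fixed-context savings over the episode: {1 - itr / mono:.0%}")
```

The same reduction also explains the 2-20x loop headroom: with a fixed context window, shrinking the per-step fixed cost leaves proportionally more room for additional turns.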
Demerits
Complexity of Implementation
ITR adds moving parts at runtime: instruction fragments and tool descriptions must be indexed, retrieved per step, and guarded by confidence-gated fallbacks. Integrating this into existing agent frameworks and tuning it reliably could be challenging in practice.
Expert Commentary
The proposed ITR method has significant implications for the development of efficient and scalable LLM agents. By reducing the number of context tokens and tools exposed per step, ITR can enable agents to run for longer periods without derailing or incurring significant costs. However, the complexity of implementing ITR in practice should not be underestimated, and further research is needed to fully realize its potential. The evaluation protocol and ablations presented in the article provide a solid foundation for understanding the benefits and limitations of ITR, and demonstrate its potential for improving the efficiency and effectiveness of LLM agents.
Recommendations
- ✓ Further research is needed to explore the applications of ITR in various domains and to address the challenges of implementing ITR in practice.
- ✓ Developers of LLM agents should consider incorporating ITR into their architectures to improve efficiency and reduce costs.