Dynamic System Instructions and Tool Exposure for Efficient Agentic LLMs
arXiv:2602.17046v1 Announce Type: new Abstract: Large Language Model (LLM) agents often run for many steps while re-ingesting long system instructions and large tool catalogs each turn. This increases cost, agent derailment probability, latency, and tool-selection errors. We propose Instruction-Tool Retrieval (ITR), a RAG variant that retrieves, per step, only the minimal system-prompt fragments and the smallest necessary subset of tools. ITR composes a dynamic runtime system prompt and exposes a narrowed toolset with confidence-gated fallbacks. Using a controlled benchmark with internally consistent numbers, ITR reduces per-step context tokens by 95%, improves correct tool routing by 32% relative, and cuts end-to-end episode cost by 70% versus a monolithic baseline. These savings enable agents to run 2-20x more loops within context limits. Savings compound with the number of agent steps, making ITR particularly valuable for long-running autonomous agents. We detail the method, evaluation protocol, ablations, and operational guidance for practical deployment.
Executive Summary
The article proposes Instruction-Tool Retrieval (ITR), a method that makes Large Language Model (LLM) agents more efficient by retrieving, at each step, only the minimal system-prompt fragments and the smallest necessary subset of tools. ITR reduces per-step context tokens by 95% and end-to-end episode cost by 70%, enabling agents to run 2-20x more loops within context limits; because the savings compound with the number of steps, ITR is particularly valuable for long-running autonomous agents. On a controlled benchmark, ITR also improves correct tool routing by 32% relative to a monolithic baseline.
Key Points
- ▸ ITR retrieves minimal system-prompt fragments and necessary tools per step
- ▸ Reduces per-step context tokens by 95% and end-to-end episode cost by 70%
- ▸ Improves correct tool routing by 32% relative to a monolithic baseline
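The abstract does not include an implementation, but the per-step loop it describes (retrieve minimal prompt fragments, expose a narrowed toolset, fall back when retrieval confidence is low) can be sketched roughly as follows. All names here (`itr_step`, `Item`, the toy lexical scorer) are hypothetical stand-ins, not the paper's API; a real deployment would score relevance with embeddings rather than word overlap.

```python
from dataclasses import dataclass

@dataclass
class Item:
    """A retrievable unit: either a system-prompt fragment or a tool description."""
    name: str
    text: str

def score(query: str, text: str) -> float:
    # Toy lexical-overlap relevance score; stands in for embedding similarity.
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / max(len(q), 1)

def itr_step(query, fragments, tools, k_frag=2, k_tool=2, min_conf=0.2):
    """Compose a dynamic runtime system prompt from the top-k instruction
    fragments and expose only the top-k tools. If the best tool score falls
    below the confidence gate, fall back to exposing the full catalog."""
    top_frags = sorted(fragments, key=lambda f: score(query, f.text),
                       reverse=True)[:k_frag]
    ranked = sorted(tools, key=lambda t: score(query, t.text), reverse=True)
    best = score(query, ranked[0].text) if ranked else 0.0
    exposed = ranked[:k_tool] if best >= min_conf else tools  # confidence-gated fallback
    prompt = "\n".join(f.text for f in top_frags)
    return prompt, exposed
```

The confidence gate matters because a narrowed toolset turns a retrieval miss into a hard failure; falling back to the full catalog trades tokens for recall on low-confidence steps.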
Merits
Efficient Resource Utilization
ITR optimizes resource utilization by reducing the number of context tokens ingested and tools exposed per step, yielding significant cost and latency savings that compound over the length of an episode.
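Because the system prompt and tool catalog are re-ingested every turn, the fixed context cost scales linearly with the number of steps, so a per-step reduction carries through to the whole episode. The arithmetic can be illustrated with made-up numbers (the 20,000-token catalog below is an assumption for illustration, not a figure from the paper; only the 95% reduction is from the abstract):

```python
# Illustrative sizes: assume a 20,000-token monolithic system prompt +
# tool catalog, vs. an ITR-composed context after the reported 95%
# per-step reduction.
MONOLITHIC_TOKENS = 20_000
ITR_TOKENS = MONOLITHIC_TOKENS // 20  # 95% smaller -> 1,000 tokens

def episode_context_tokens(per_step: int, steps: int) -> int:
    # Fixed context is re-sent each turn, so cost grows linearly in steps.
    return per_step * steps

steps = 50
mono = episode_context_tokens(MONOLITHIC_TOKENS, steps)  # 1,000,000 tokens
itr = episode_context_tokens(ITR_TOKENS, steps)          # 50,000 tokens
print(f"fixed-context savings over the episode: {1 - itr / mono:.0%}")
```

The same reduction also explains the 2-20x loop headroom: with a fixed context window, shrinking the per-step fixed cost leaves proportionally more room for additional turns.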
Demerits
Complexity of Implementation
ITR adds moving parts at runtime: instruction fragments and tool descriptions must be indexed, retrieved per step, and guarded by confidence-gated fallbacks. Integrating this into existing agent frameworks and tuning it reliably could be challenging in practice.
Expert Commentary
The proposed ITR method has significant implications for the development of efficient and scalable LLM agents. By reducing the number of context tokens and tools exposed per step, ITR can enable agents to run for longer periods without derailing or incurring significant costs. However, the complexity of implementing ITR in practice should not be underestimated, and further research is needed to fully realize its potential. The evaluation protocol and ablations presented in the article provide a solid foundation for understanding the benefits and limitations of ITR, and demonstrate its potential for improving the efficiency and effectiveness of LLM agents.
Recommendations
- ✓ Further research is needed to explore the applications of ITR in various domains and to address the challenges of implementing ITR in practice.
- ✓ Developers of LLM agents should consider incorporating ITR into their architectures to improve efficiency and reduce costs.