SkillRouter: Retrieve-and-Rerank Skill Selection for LLM Agents at Scale
arXiv:2603.22455v1. Abstract: As LLM agent ecosystems grow, the number of available skills (tools, plugins) has reached tens of thousands, making it infeasible to inject all skills into an agent's context. This creates a need for skill routing -- retrieving the most relevant skills from a large pool given a user task. The problem is compounded by pervasive functional overlap in community skill repositories, where many skills share similar names and purposes yet differ in implementation details. Despite its practical importance, skill routing remains under-explored. Current agent architectures adopt a progressive disclosure design -- exposing only skill names and descriptions to the agent while keeping the full implementation body hidden -- implicitly treating metadata as sufficient for selection. We challenge this assumption through a systematic empirical study on a benchmark of ~80K skills and 75 expert-verified queries. Our key finding is that the skill body (full implementation text) is the decisive signal: removing it causes 29-44 percentage point degradation across all retrieval methods, and cross-encoder attention analysis reveals 91.7% of attention concentrating on the body field. Motivated by this finding, we propose SkillRouter, a two-stage retrieve-and-rerank pipeline totaling only 1.2B parameters (0.6B encoder + 0.6B reranker). SkillRouter achieves 74.0% top-1 routing accuracy and delivers the strongest average result among the compact and zero-shot baselines we evaluate, while remaining deployable on consumer hardware.
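The two-stage shape described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Skill` class, the bag-of-words `retrieve` scorer, and the token-overlap `rerank` scorer are hypothetical stand-ins for the paper's 0.6B encoder and 0.6B reranker, and the three toy skills are invented for the example. The point is only the control flow: a cheap scorer shortlists candidates from the whole pool, then a costlier scorer orders the shortlist.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class Skill:
    # Hypothetical record layout: metadata plus full implementation text.
    name: str
    description: str
    body: str

    def full_text(self) -> str:
        return f"{self.name} {self.description} {self.body}"


def tokenize(text: str) -> list[str]:
    return text.lower().split()


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, skills: list[Skill], k: int) -> list[Skill]:
    """Stage 1 (stand-in for a learned encoder): score the whole pool, keep top k."""
    q = Counter(tokenize(query))
    ranked = sorted(
        skills,
        key=lambda s: cosine(q, Counter(tokenize(s.full_text()))),
        reverse=True,
    )
    return ranked[:k]


def rerank(query: str, candidates: list[Skill]) -> list[Skill]:
    """Stage 2 (stand-in for a learned reranker): rescore only the shortlist."""
    q = set(tokenize(query))

    def score(s: Skill) -> float:
        return len(q & set(tokenize(s.body))) / len(q) if q else 0.0

    return sorted(candidates, key=score, reverse=True)


def route(query: str, skills: list[Skill], k: int = 2) -> Skill:
    """Return the single top-ranked skill for a query."""
    if not skills:
        raise ValueError("skill pool is empty")
    return rerank(query, retrieve(query, skills, k))[0]


# Toy pool: two skills with overlapping metadata, one unrelated skill.
skills = [
    Skill("pdf-tools", "Work with PDF files", "merge pdf pages using pypdf writer append"),
    Skill("pdf-tools-lite", "Work with PDF files", "extract text from pdf pages using pdfminer"),
    Skill("csv-clean", "Clean CSV data", "drop duplicate rows and normalize headers"),
]
query = "merge two pdf files into one"
best = route(query, skills)
print(best.name)
```

In a real deployment the first stage would typically be an embedding index over tens of thousands of skills and the second a cross-encoder applied to a few dozen candidates; the split exists so that the expensive model never has to see the full pool.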
Executive Summary
The article presents SkillRouter, a two-stage retrieve-and-rerank pipeline that selects relevant skills for LLM agents from pools of tens of thousands of candidates. Its empirical study, run on a benchmark of roughly 80K skills and 75 expert-verified queries, finds that the skill body (the full implementation text), not the name-and-description metadata, is the decisive signal for selection: removing the body costs 29 to 44 percentage points across all retrieval methods tested. SkillRouter totals 1.2B parameters (a 0.6B encoder plus a 0.6B reranker), achieves 74.0% top-1 routing accuracy, and posts the strongest average result among the compact and zero-shot baselines the authors evaluate, while remaining deployable on consumer hardware. The result calls into question the progressive disclosure design of current agent architectures, which exposes only metadata to the agent, and it matters most for community skill repositories where many skills share similar names and purposes.
Key Points
- ▸ The skill body, not metadata, is the decisive signal for skill selection: removing it degrades every retrieval method tested by 29 to 44 percentage points, and 91.7% of cross-encoder attention concentrates on the body field.
- ▸ SkillRouter, a two-stage retrieve-and-rerank pipeline with 1.2B total parameters, achieves 74.0% top-1 routing accuracy and the strongest average result among the compact and zero-shot baselines evaluated.
- ▸ The findings challenge the progressive disclosure design of current agent architectures, which exposes only skill names and descriptions, in a setting already complicated by pervasive functional overlap in community skill repositories.
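To make the body-versus-metadata point concrete, here is a toy illustration, not the paper's experiment. The two skills, the query, and the word-overlap scorer are all invented for the example. It shows how two skills with near-identical names and descriptions can tie under a metadata-only view yet separate once the body text is included.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Skill:
    # Hypothetical record layout for illustration only.
    name: str
    description: str
    body: str


def metadata_view(s: Skill) -> str:
    """What a progressive disclosure design exposes: name and description."""
    return f"{s.name} {s.description}"


def full_view(s: Skill) -> str:
    """Metadata plus the full implementation text."""
    return f"{s.name} {s.description} {s.body}"


def overlap(query: str, text: str) -> int:
    """Crude relevance score: number of distinct query words found in the text."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def rank(query: str, skills: list[Skill], view: Callable[[Skill], str]) -> list[Skill]:
    # sorted() is stable, so tied skills keep their original order.
    return sorted(skills, key=lambda s: overlap(query, view(s)), reverse=True)


skills = [
    Skill(
        "image-resize",
        "Resize images",
        "open each file with pillow and call thumbnail to keep the aspect ratio",
    ),
    Skill(
        "image-resizer",
        "Resize images",
        "loop over a folder and run imagemagick convert as a batch job",
    ),
]
query = "resize images in a folder as a batch"

meta_scores = [overlap(query, metadata_view(s)) for s in skills]
full_scores = [overlap(query, full_view(s)) for s in skills]
print(meta_scores)  # identical scores: metadata cannot separate the two skills
print(full_scores)  # the body breaks the tie
print(rank(query, skills, full_view)[0].name)
```

With metadata alone the ranking falls back to list order, which is arbitrary; only the body carries the words that distinguish batch processing from single-file resizing.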
Merits
Strength in Addressing a Critical Issue
The study addresses a critical issue in LLM agent ecosystems, where the number of available skills has reached tens of thousands, making it infeasible to inject all skills into an agent's context.
Novel and Efficient Solution
The pipeline is compact: at 1.2B total parameters (0.6B encoder + 0.6B reranker), it achieves 74.0% top-1 routing accuracy while remaining deployable on consumer hardware.
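For readers unfamiliar with the metric, the sketch below shows one plausible way to compute top-1 routing accuracy and a percentage-point drop between two configurations. The function names and the toy predictions are assumptions made for illustration, as is the choice to allow several acceptable skills per query; the paper's exact evaluation protocol is not described in the abstract, and the numbers here are not the paper's results.

```python
def top1_accuracy(predictions: list[str], gold: list[set[str]]) -> float:
    """Percentage of queries whose top-ranked skill is among the accepted answers."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not predictions:
        return 0.0
    hits = sum(1 for pred, accepted in zip(predictions, gold) if pred in accepted)
    return 100.0 * hits / len(predictions)


def percentage_point_drop(before: float, after: float) -> float:
    """Absolute difference between two percentages (not a relative change)."""
    return before - after


# Toy data: four queries, each with a set of accepted skill names.
gold = [{"a"}, {"b"}, {"c"}, {"x"}]
with_body = ["a", "b", "c", "d"]      # hypothetical top-1 picks using full text
metadata_only = ["a", "z", "z", "d"]  # hypothetical top-1 picks using metadata only

acc_full = top1_accuracy(with_body, gold)
acc_meta = top1_accuracy(metadata_only, gold)
print(acc_full, acc_meta, percentage_point_drop(acc_full, acc_meta))
```

The distinction matters when reading the reported 29 to 44 point degradation: a fall from 75% to 25% is a 50 percentage point drop, which is a much larger relative loss than the same figure would suggest if read as "50% worse".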
Demerits
Limited Generalizability
The evaluation rests on one benchmark of roughly 80K skills and 75 expert-verified queries, so the findings and the proposed method may not transfer directly to other domains or to skill repositories with different characteristics.
Dependence on High-Quality Training Data
The effectiveness of SkillRouter may depend on the availability of high-quality training data, which may not be readily available in all scenarios.
Expert Commentary
SkillRouter is a compact answer to the skill routing problem, and its central finding, that the full skill body rather than metadata drives selection, challenges the metadata-only progressive disclosure design that current agent architectures adopt. The limitations noted above, a possible dependence on high-quality training data and uncertain generalizability beyond the evaluated benchmark, should be weighed before applying the method elsewhere. Overall, the article makes a useful contribution to LLM agent research, with direct implications for how agents expose and select skills from large repositories.
Recommendations
- ✓ Future research should focus on developing more generalizable and robust skill routing methods that can be applied to diverse skill repositories and domains.
- ✓ SkillRouter should be evaluated on additional skill repositories and larger query sets to test whether its accuracy and scalability hold beyond the 75-query benchmark.
Sources
Original: arXiv - cs.LG