Think Fast and Slow: Step-Level Cognitive Depth Adaptation for LLM Agents

arXiv:2602.12662v1. Abstract: Large language models (LLMs) are increasingly deployed as autonomous agents for multi-turn decision-making tasks. However, current agents typically rely on fixed cognitive patterns: non-thinking models generate immediate responses, while thinking models engage in deep reasoning uniformly. This rigidity is inefficient for long-horizon tasks, where cognitive demands vary significantly from step to step, with some requiring strategic planning and others only routine execution. In this paper, we introduce CogRouter, a framework that trains agents to dynamically adapt cognitive depth at each step. Grounded in ACT-R theory, we design four hierarchical cognitive levels ranging from instinctive responses to strategic planning. Our two-stage training approach includes Cognition-aware Supervised Fine-tuning (CoSFT) to instill stable level-specific patterns, and Cognition-aware Policy Optimization (CoPO) for step-level credit assignment via confide

Ruihan Yang, Fanghua Ye, Xiang We, Ruoqing Zhao, Kang Luo, Xinbo Xu, Bo Zhao, Ruotian Ma, Shanyi Wang, Zhaopeng Tu, Xiaolong Li, Deqing Yang, Linus · March 7, 2026

#cs.AI #cs.CL

Executive Summary

The article 'Think Fast and Slow: Step-Level Cognitive Depth Adaptation for LLM Agents' introduces CogRouter, a framework that trains large language model (LLM) agents to choose how deeply to reason at each decision-making step instead of applying one fixed reasoning pattern throughout a task. Drawing on ACT-R theory, it defines four hierarchical cognitive levels, ranging from instinctive responses to strategic planning. Training proceeds in two stages: Cognition-aware Supervised Fine-tuning (CoSFT) instills stable level-specific patterns, and Cognition-aware Policy Optimization (CoPO) performs step-level credit assignment so that the agent selects the depth that maximizes action confidence. In experiments on ALFWorld and ScienceWorld, CogRouter with Qwen2.5-7B reaches an 82.3% success rate while using 62% fewer tokens, and it is reported to outperform models such as GPT-4o and OpenAI-o3.
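The selection criterion described above (pick the depth that maximizes action confidence) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the names `CognitiveLevel`, `route_step`, and `toy_act` are hypothetical, the two middle level names are placeholders because the summary names only the endpoints, and the confidence values are made up. A trained agent would presumably pick a level directly rather than trying all four, since evaluating every level at every step would defeat the token savings.

```python
from enum import IntEnum
from typing import Callable, Tuple


class CognitiveLevel(IntEnum):
    """Four depth levels; only the two endpoints are named in the text."""
    INSTINCTIVE = 1  # immediate response, no explicit reasoning
    LEVEL_2 = 2      # placeholder name for an intermediate level
    LEVEL_3 = 3      # placeholder name for an intermediate level
    STRATEGIC = 4    # full strategic planning


# Hypothetical signature: (observation, level) -> (action, confidence in [0, 1]).
ActFn = Callable[[str, CognitiveLevel], Tuple[str, float]]


def route_step(observation: str, act: ActFn) -> Tuple[CognitiveLevel, str, float]:
    """Return the (level, action, confidence) with the highest confidence.

    Levels are tried from shallow to deep, and only a strictly higher
    confidence replaces the current best, so ties go to the shallower
    (cheaper) level.
    """
    best = None
    for level in CognitiveLevel:  # iterates in definition order, 1 through 4
        action, confidence = act(observation, level)
        if best is None or confidence > best[2]:
            best = (level, action, confidence)
    return best


def toy_act(observation: str, level: CognitiveLevel) -> Tuple[str, float]:
    """Stand-in for an agent call, with made-up confidence values."""
    if "plan" in observation:
        # Planning steps only become confident with deeper reasoning.
        confidence = {
            CognitiveLevel.INSTINCTIVE: 0.2,
            CognitiveLevel.LEVEL_2: 0.4,
            CognitiveLevel.LEVEL_3: 0.6,
            CognitiveLevel.STRATEGIC: 0.9,
        }[level]
    else:
        # Routine steps are equally easy at any depth.
        confidence = 0.8
    return f"action@{level.name}", confidence


if __name__ == "__main__":
    print(route_step("pick up the mug", toy_act))
    print(route_step("plan how to heat the mug", toy_act))
```

In this toy setup the routine step is routed to the shallowest level and the planning step to the deepest, which is the behavior the framework is meant to learn.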

Key Points

▸ Introduces CogRouter, a framework for adapting cognitive depth at each step of an LLM agent's task.
▸ Defines four hierarchical cognitive levels based on ACT-R theory, from instinctive responses to strategic planning.
▸ Trains in two stages: CoSFT, then CoPO.
▸ Reports higher success rates and better token efficiency than the compared models, including GPT-4o and OpenAI-o3.
▸ Achieves an 82.3% success rate with Qwen2.5-7B while using 62% fewer tokens.
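To make the "62% fewer tokens" figure concrete, the short Python sketch below converts a percentage reduction into remaining token usage and an equivalent reduction factor. The baseline against which the 62% is measured is not stated in this summary, so the 1,000-token baseline is purely illustrative, and both function names are hypothetical.

```python
def tokens_after_reduction(baseline_tokens: int, reduction_pct: int) -> float:
    """Tokens remaining after cutting usage by reduction_pct percent."""
    if not 0 <= reduction_pct < 100:
        raise ValueError("reduction_pct must be in [0, 100)")
    return baseline_tokens * (100 - reduction_pct) / 100


def reduction_factor(reduction_pct: int) -> float:
    """How many times smaller the usage is after the reduction."""
    if not 0 <= reduction_pct < 100:
        raise ValueError("reduction_pct must be in [0, 100)")
    return 100 / (100 - reduction_pct)


if __name__ == "__main__":
    # Illustrative baseline of 1,000 tokens per episode (an assumption).
    print(tokens_after_reduction(1000, 62))  # 380.0
    print(round(reduction_factor(62), 2))    # 2.63
```

Under that assumed baseline, a 62% reduction leaves 38% of the original usage, which is roughly a 2.6-fold decrease.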

Merits

Innovative Framework

CogRouter lets an agent adapt its cognitive depth at each step. This addresses the rigidity of current agents: non-thinking models respond immediately at every step, while thinking models apply deep reasoning uniformly, even though long-horizon tasks mix steps that need strategic planning with steps that need only routine execution.
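A rough way to see why per-step adaptation can save tokens is to compare episode-level token totals under different depth policies. The Python sketch below assumes per-step token costs for four depth levels and a 20-step episode. All numbers and names (`TOKENS_PER_STEP`, `episode_tokens`) are hypothetical and chosen only for illustration. They are not taken from the paper.

```python
from typing import Dict

# Hypothetical per-step token costs for four depth levels (1 = shallowest).
TOKENS_PER_STEP: Dict[int, int] = {1: 20, 2: 80, 3: 200, 4: 500}


def episode_tokens(steps_per_level: Dict[int, int]) -> int:
    """Total tokens for an episode, given how many steps ran at each level."""
    return sum(TOKENS_PER_STEP[level] * n for level, n in steps_per_level.items())


# A 20-step episode under three policies (step counts are made up).
adaptive = {1: 12, 2: 5, 3: 2, 4: 1}  # mostly routine, one planning step
uniform_deep = {4: 20}                # a "thinking" model on every step
uniform_shallow = {1: 20}             # a "non-thinking" model on every step

if __name__ == "__main__":
    for name, policy in [
        ("adaptive", adaptive),
        ("uniform deep", uniform_deep),
        ("uniform shallow", uniform_shallow),
    ]:
        print(f"{name}: {episode_tokens(policy)} tokens")
```

With these assumed costs, the adaptive policy spends far fewer tokens than uniform deep reasoning while still reserving deep reasoning for the step that needs it. The uniform shallow policy is cheaper still, but it never plans.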

Theoretical Foundation

The four hierarchical cognitive levels are grounded in ACT-R theory, an established cognitive architecture, which gives the level design a principled basis rather than an ad hoc one.

Empirical Success

On ALFWorld and ScienceWorld, CogRouter improves both success rate and token efficiency, and it outperforms models such as GPT-4o and OpenAI-o3 while running on a 7B-parameter base model (Qwen2.5-7B).

Demerits

Complexity

The two-stage training approach, while effective, adds complexity to the training pipeline, since both CoSFT and CoPO must be implemented and run. This may increase computational cost and implementation effort, potentially limiting the framework's accessibility.

Generalizability

The experiments are conducted on specific environments (ALFWorld and ScienceWorld), which may not fully represent the diverse range of real-world applications. Further research is needed to assess the generalizability of the framework.

Token Efficiency

While the framework cuts token usage substantially, a step that is routed to too shallow a level receives less reasoning than it needs, which may reduce the quality of decision-making in some contexts.

Expert Commentary

The article presents a compelling advance for autonomous AI agents by addressing the rigidity of fixed reasoning patterns. The four cognitive levels are grounded in ACT-R theory, and the two-stage training approach (CoSFT followed by CoPO) teaches the agent to adapt its cognitive depth dynamically, which matters most in long-horizon tasks. The empirical results are impressive, showing gains in both success rate and token efficiency. However, the complexity of the framework and the need for further validation in diverse real-world scenarios are notable limitations. In practice, the research could benefit industries that rely on AI agents, and its emphasis on resource efficiency is relevant to sustainable AI development. Overall, the article makes a significant contribution and offers a strong reference point for future research on autonomous AI agents.

Recommendations

✓ Further research should focus on validating the CogRouter framework in a broader range of real-world applications to assess its generalizability and robustness.
✓ Efforts should be made to simplify the training process and reduce computational complexity to enhance the framework's accessibility and scalability.

Sources

arXiv - cs.AI

Think Fast and Slow: Step-Level Cognitive Depth Adaptation for LLM Agents

Related Articles

ConstitutionGPT: An AI-Powered Multilingual Legal Assistance System for Indian Citizens

AI Copyright Infringement: Navigating the Legal Risks of AI-Generated Content

The Rhetoric of Machine Learning

Busemann energy-based attention for emotion analysis in Poincaré discs

