When to Think Fast and Slow? AMOR: Entropy-Based Metacognitive Gate for Dynamic SSM-Attention Switching
arXiv:2602.13215v1 Announce Type: new Abstract: Transformers allocate uniform computation to every position, regardless of difficulty. State Space Models (SSMs) offer efficient alternatives but struggle with precise information retrieval over a long horizon. Inspired by dual-process theories of cognition (Kahneman, 2011), we propose AMOR (Adaptive Metacognitive Output Router), a hybrid architecture that dynamically engages sparse attention only when an SSM backbone is "uncertain"--as measured by prediction entropy. Compared to standard transformers, AMOR gains efficiency by projecting keys and values from SSM hidden states (Ghost KV), reusing the SSM's O(n) computation rather than requiring O(n^2) attention at every layer. On small-scale synthetic retrieval tasks, AMOR outperforms both SSM-only and transformer-only baselines, achieving perfect retrieval accuracy while engaging attention on only 22% of positions. We validate that prediction entropy reliably signals retrieval need, with a gap of 1.09 nats (nearly half the entropy range) between retrieval and local positions. Additionally, our approach provides interpretable adaptive computation, where routing decisions can be understood in information-theoretic terms.
Executive Summary
The article introduces AMOR (Adaptive Metacognitive Output Router), a hybrid architecture that dynamically switches between a State Space Model (SSM) backbone and sparse attention based on prediction entropy. Inspired by dual-process theories of cognition, AMOR engages attention only when the SSM's prediction is uncertain, improving efficiency without sacrificing retrieval accuracy. On small-scale synthetic retrieval tasks, AMOR outperforms both SSM-only and transformer-only baselines, and its routing decisions remain interpretable in information-theoretic terms, making it a promising advance in efficient adaptive computation.
Key Points
- AMOR dynamically switches between an SSM backbone and sparse attention based on prediction entropy.
- AMOR achieves perfect retrieval accuracy on small-scale synthetic tasks while engaging attention on only 22% of positions.
- Prediction entropy reliably signals retrieval need, with a 1.09-nat gap (nearly half the entropy range) between retrieval and local positions.
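The gating mechanism above can be sketched in a few lines. The paper does not publish its exact implementation, so this is a minimal illustration under stated assumptions: the gate computes the Shannon entropy (in nats) of the SSM's next-token softmax distribution and engages attention wherever that entropy exceeds a threshold; `threshold=1.0` is an arbitrary illustrative value, not the paper's.

```python
import numpy as np

def prediction_entropy(logits):
    """Shannon entropy of the softmax distribution, in nats."""
    z = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    p = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    return -(p * np.log(p + 1e-12)).sum(axis=-1)

def entropy_gate(logits, threshold):
    """Boolean mask over positions: True where attention should engage."""
    return prediction_entropy(logits) > threshold

# A near-uniform distribution (uncertain SSM) trips the gate; a sharply
# peaked one (confident SSM prediction) stays on the cheap O(n) path.
uncertain = np.zeros(10)                   # uniform over 10 tokens: ln(10) ~ 2.30 nats
confident = np.array([10.0] + [0.0] * 9)  # peaked: near 0 nats
logits = np.stack([uncertain, confident])
mask = entropy_gate(logits, threshold=1.0)
print(mask)  # [ True False]
```

Because entropy is computed from quantities the model already produces, the gate adds negligible overhead relative to the attention it avoids.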
Merits
Efficiency
AMOR significantly reduces computational overhead by engaging attention only when necessary, leveraging the O(n) computation of SSMs instead of the O(n^2) attention required by transformers.
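The abstract states that Ghost KV projects keys and values from the SSM's hidden states, reusing the O(n) computation already performed. A minimal sketch of that idea follows; the projection matrices `W_q`, `W_k`, `W_v`, the causal masking, and the specific gated positions are illustrative assumptions (random weights here), not the paper's trained parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 8, 16                       # sequence length, hidden size
H = rng.standard_normal((n, d))    # SSM hidden states, already computed in O(n)

# "Ghost KV": derive keys/values from the SSM states via linear maps,
# instead of running a separate attention stack at every layer.
W_q, W_k, W_v = (rng.standard_normal((d, d)) for _ in range(3))
K, V = H @ W_k, H @ W_v

def attend(q, K, V):
    """Single-query softmax attention over the Ghost KV cache."""
    scores = (K @ q) / np.sqrt(K.shape[1])
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ V

# Sparse attention runs only at gated (high-entropy) positions,
# e.g. positions 2 and 5 here; everything else passes through untouched.
gated = [2, 5]
out = H.copy()
for t in gated:
    out[t] = attend(H[t] @ W_q, K[:t + 1], V[:t + 1])  # causal: keys up to t
```

The key design point is that K and V cost one matrix multiply over states the SSM produced anyway, so the quadratic term is paid only at the small fraction of gated positions (22% in the paper's experiments) rather than everywhere.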
Accuracy
AMOR achieves perfect retrieval accuracy on synthetic tasks, demonstrating its effectiveness in information retrieval.
Interpretability
The adaptive computation in AMOR is interpretable in information-theoretic terms, providing a clear understanding of routing decisions.
Demerits
Limited Scope
The study is limited to small-scale synthetic retrieval tasks, and its performance on larger, more complex datasets remains untested.
Complexity
The hybrid architecture of AMOR introduces additional complexity, which may pose challenges in implementation and scalability.
Entropy Measurement
Prediction entropy captures only the uncertainty the model itself expresses: a model that is confidently wrong has low entropy, so an entropy-only gate could fail to engage attention precisely when retrieval is most needed.
Expert Commentary
AMOR offers a genuinely novel take on adaptive computation: rather than allocating uniform work to every position, the model consults its own prediction entropy and escalates to sparse attention only when its SSM backbone is uncertain. The headline result, perfect retrieval accuracy on synthetic tasks while attending at only 22% of positions, is promising, and the reported 1.09-nat entropy gap between retrieval and local positions suggests the gating signal is well separated. The main caveats are scope and robustness: evaluation is confined to small-scale synthetic retrieval tasks, the hybrid architecture adds implementation complexity, and prediction entropy as a sole gating metric deserves further scrutiny. Even so, AMOR represents a meaningful step toward efficient, interpretable adaptive computation, with potential applications wherever accurate long-context retrieval at low cost matters.
Recommendations
- Future research should validate AMOR's performance on larger, more complex datasets to assess its scalability and robustness.
- Further studies should explore metrics beyond prediction entropy to better capture the nuances of uncertainty in information retrieval tasks.