Cast-R1: Learning Tool-Augmented Sequential Decision Policies for Time Series Forecasting

Xiaoyu Tao, Mingyue Cheng, Chuang Jiang, Tian Gao, Huanjian Zhang, Yaguo Liu · February 18, 2026

#cs.LG

arXiv:2602.13802v1

Abstract: Time series forecasting has long been dominated by model-centric approaches that formulate prediction as a single-pass mapping from historical observations to future values. Despite recent progress, such formulations often struggle in complex and evolving settings, largely because most forecasting models lack the ability to autonomously acquire informative evidence, reason about potential future changes, or revise predictions through iterative decision processes. In this work, we propose Cast-R1, a learned time series forecasting framework that reformulates forecasting as a sequential decision-making problem. Cast-R1 introduces a memory-based state management mechanism that maintains decision-relevant information across interaction steps, enabling the accumulation of contextual evidence to support long-horizon reasoning. Building on this formulation, forecasting is carried out through a tool-augmented agentic workflow, in which the agent autonomously interacts with a modular toolkit to extract statistical features, invoke lightweight forecasting models for decision support, perform reasoning-based prediction, and iteratively refine forecasts through self-reflection. To train Cast-R1, we adopt a two-stage learning strategy that combines supervised fine-tuning with multi-turn reinforcement learning, together with a curriculum learning scheme that progressively increases task difficulty to improve policy learning. Extensive experiments on multiple real-world time series datasets demonstrate the effectiveness of Cast-R1. We hope this work provides a practical step towards further exploration of agentic paradigms for time series modeling. Our code is available at https://github.com/Xiaoyu-Tao/Cast-R1-TS.

Executive Summary

The article introduces Cast-R1, a framework for time series forecasting that reformulates the problem as a sequential decision-making process. Unlike traditional model-centric approaches, Cast-R1 employs a memory-based state management mechanism and a tool-augmented agentic workflow to autonomously acquire evidence, reason about future changes, and iteratively refine predictions. The framework is trained with a two-stage learning strategy that combines supervised fine-tuning and multi-turn reinforcement learning, supported by a curriculum learning scheme. The authors report that extensive experiments on multiple real-world datasets demonstrate its effectiveness, and they present the work as a step towards agentic paradigms in time series modeling.

Key Points

▸ Cast-R1 reformulates time series forecasting as a sequential decision-making problem.
▸ The framework uses a memory-based state management mechanism to accumulate contextual evidence.
▸ A tool-augmented agentic workflow allows the agent to autonomously interact with a modular toolkit for decision support and prediction refinement.
▸ Training involves a two-stage strategy combining supervised fine-tuning and multi-turn reinforcement learning with curriculum learning.
▸ Extensive experiments on real-world datasets validate the effectiveness of Cast-R1.
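
To make the sequential decision-making framing concrete, here is a minimal Python sketch of an agent loop with a memory and a small toolkit. It is an illustration under stated assumptions, not the authors' implementation: the tool names (`extract_features`, `naive_forecast`, `drift_forecast`), the `Memory` class, and the hand-written `scripted_policy` are all hypothetical, and the scripted policy merely stands in for the learned policy that Cast-R1 trains.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Dict, List, Tuple


def extract_features(history: List[float]) -> Dict[str, float]:
    """Hypothetical statistics tool; expects at least two observations."""
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return {"mean": mean(history), "last": history[-1], "slope": slope}


def naive_forecast(history: List[float], horizon: int) -> List[float]:
    """Hypothetical lightweight model: repeat the last observed value."""
    return [history[-1]] * horizon


def drift_forecast(history: List[float], horizon: int) -> List[float]:
    """Hypothetical lightweight model: extend the average slope."""
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + slope * (h + 1) for h in range(horizon)]


@dataclass
class Memory:
    """Keeps decision-relevant notes across interaction steps."""
    notes: Dict[str, object] = field(default_factory=dict)


def scripted_policy(memory: Memory) -> str:
    """Stand-in for a learned policy: choose the next action from memory."""
    for key, action in [("features", "extract_features"),
                        ("candidate", "call_model"),
                        ("forecast", "predict"),
                        ("reflected", "reflect")]:
        if key not in memory.notes:
            return action
    return "stop"


def run_episode(history: List[float], horizon: int,
                max_steps: int = 8) -> Tuple[List[float], List[str]]:
    """Run one forecasting episode; return the forecast and the action trace."""
    memory, actions = Memory(), []
    for _ in range(max_steps):
        action = scripted_policy(memory)
        if action == "stop":
            break
        actions.append(action)
        if action == "extract_features":
            memory.notes["features"] = extract_features(history)
        elif action == "call_model":
            # Use the stored evidence to pick a lightweight model.
            trending = abs(memory.notes["features"]["slope"]) > 1e-9
            model = drift_forecast if trending else naive_forecast
            memory.notes["candidate"] = model(history, horizon)
        elif action == "predict":
            memory.notes["forecast"] = list(memory.notes["candidate"])
        elif action == "reflect":
            # Toy self-check: a series that was never negative stays non-negative.
            if min(history) >= 0:
                memory.notes["forecast"] = [max(0.0, v) for v in memory.notes["forecast"]]
            memory.notes["reflected"] = True
    return memory.notes["forecast"], actions
```

Calling `run_episode([1.0, 2.0, 3.0, 4.0], 2)` walks through feature extraction, a model call, prediction, and reflection in turn. In a trained system the order and number of steps would be chosen by the policy rather than fixed in advance.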

Merits

Innovative Approach

Cast-R1 introduces a novel approach to time series forecasting by treating it as a sequential decision-making problem, which allows for more flexible and adaptive predictions.

Autonomous Evidence Acquisition

The framework's ability to autonomously acquire informative evidence and reason about potential future changes sets it apart from traditional model-centric approaches.

Effective Training Strategy

The two-stage learning strategy combines supervised fine-tuning with multi-turn reinforcement learning, and the curriculum progressively increases task difficulty. According to the authors, this combination is intended to improve policy learning.
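
One way to picture a curriculum that progressively increases task difficulty is sketched below in Python. The source does not say how Cast-R1 measures difficulty, so the proxy used here (the standard deviation of first differences) and the function names `difficulty` and `build_curriculum` are assumptions made purely for illustration.

```python
from statistics import pstdev
from typing import List, Sequence


def difficulty(series: Sequence[float]) -> float:
    """Assumed difficulty proxy: volatility of step-to-step changes.

    Expects at least two observations.
    """
    diffs = [b - a for a, b in zip(series, series[1:])]
    return pstdev(diffs)


def build_curriculum(tasks: List[Sequence[float]],
                     n_stages: int) -> List[List[Sequence[float]]]:
    """Split tasks into cumulative stages, easiest first.

    Each stage keeps the easier tasks and adds harder ones, so the
    final stage contains the full task set.
    """
    ranked = sorted(tasks, key=difficulty)
    stages = []
    for stage in range(1, n_stages + 1):
        # Integer ceiling of len(ranked) * stage / n_stages.
        cutoff = (len(ranked) * stage + n_stages - 1) // n_stages
        stages.append(ranked[:cutoff])
    return stages
```

A training loop would then iterate over the stages in order, so the policy sees smooth series before volatile ones. Whether stages are cumulative or disjoint is a design choice; the cumulative version shown here keeps easy examples in later stages to limit forgetting.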

Demerits

Complexity

The complexity of the framework may pose challenges in terms of computational resources and implementation, potentially limiting its accessibility.

Generalizability

While the framework shows promise, its effectiveness across diverse datasets and real-world applications needs further validation.

Training Time

The multi-turn reinforcement learning and curriculum learning strategies may increase training time, which could be a drawback for time-sensitive applications.

Expert Commentary

Cast-R1 recasts time series forecasting as a sequential decision-making process rather than a single-pass mapping from history to future values. This targets a limitation the authors identify in model-centric approaches: the inability to gather evidence, reason about possible changes, or revise a prediction. The memory-based state management mechanism and the tool-augmented agentic workflow let the agent acquire evidence and refine its forecast over several interaction steps, in contrast to static prediction models. The two-stage learning strategy, combined with curriculum learning, is designed to improve policy learning. The framework's complexity and the likely increase in training time remain important considerations. The abstract reports extensive experiments on multiple real-world datasets but gives no quantitative results, so the size of the improvement cannot be judged from it alone. If the results hold up, the approach could be useful in industries that depend on accurate time series forecasting, and agentic paradigms of this kind may also inform policy discussions on the ethical and regulatory aspects of autonomous decision-making systems.

Recommendations

✓ Further validation of Cast-R1's effectiveness across diverse datasets and real-world applications is recommended to ensure its generalizability.
✓ Exploring ways to simplify the framework and reduce training time could make it more accessible and practical for a wider range of applications.

Sources

arXiv - cs.LG
