
Thoth: Mid-Training Bridges LLMs to Time Series Understanding


arXiv:2603.01042v1

Abstract: Large Language Models (LLMs) have demonstrated remarkable success in general-purpose reasoning. However, they still struggle to understand and reason about time series data, which limits their effectiveness in decision-making scenarios that depend on temporal dynamics. In this paper, we propose Thoth, the first family of mid-trained LLMs with general-purpose time series understanding capabilities. As a pivotal intermediate stage, mid-training achieves task- and domain-agnostic alignment between time series and natural language, for which we construct Book-of-Thoth, a high-quality, time-series-centric mid-training corpus. Book-of-Thoth enables both time-series-to-text and text-to-time-series generation, equipping LLMs with a foundational grasp of temporal patterns. To better evaluate advanced reasoning capabilities, we further present KnoTS, a novel benchmark of knowledge-intensive time series understanding, designed for joint reasoning over temporal patterns and domain knowledge. Extensive experiments demonstrate that mid-training with Book-of-Thoth enables Thoth to significantly outperform its base model and advanced LLMs across a range of time series question answering benchmarks. Moreover, Thoth exhibits superior capabilities when fine-tuned under data scarcity, underscoring the effectiveness of mid-training for time series understanding. Code is available at: https://github.com/thuml/Thoth.
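To make the idea of time-series-to-text and text-to-time-series alignment concrete, the sketch below builds one bidirectional training pair from a numeric series and a natural-language description. This is a generic illustration, not the paper's actual data format: the serialization scheme, prompt wording, and helper names (`serialize_series`, `make_alignment_pair`) are all assumptions for exposition.

```python
import math

def serialize_series(values, precision=2):
    """Render a numeric series as a plain-text token sequence.
    A common generic serialization; Book-of-Thoth's actual format may differ."""
    return " ".join(f"{v:.{precision}f}" for v in values)

def make_alignment_pair(values, description):
    """Build one bidirectional alignment example:
    a series-to-text prompt and a text-to-series prompt over the same data."""
    series_text = serialize_series(values)
    return {
        "ts_to_text": {
            "prompt": f"Time series: {series_text}\nDescribe the temporal pattern.",
            "target": description,
        },
        "text_to_ts": {
            "prompt": f"Generate a time series matching: {description}",
            "target": series_text,
        },
    }

# Illustrative example: an upward trend with a 7-step seasonal cycle.
values = [10 + 0.5 * t + 2 * math.sin(2 * math.pi * t / 7) for t in range(14)]
pair = make_alignment_pair(values, "An upward trend with a 7-step seasonal cycle.")
print(pair["ts_to_text"]["prompt"])
print(pair["text_to_ts"]["prompt"])
```

Pairing both directions over the same underlying series is what lets a single corpus teach the model to map temporal patterns into language and language back into temporal patterns.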

Executive Summary

This article proposes Thoth, a novel family of mid-trained Large Language Models (LLMs) designed to understand time series data. Thoth bridges the gap between natural language and time series through a "mid-training" stage on a high-quality corpus called Book-of-Thoth. The authors also introduce KnoTS, a benchmark for evaluating advanced reasoning over time series. Experimental results show that Thoth significantly outperforms its base model and other LLMs on time series question answering tasks, and remains effective when fine-tuned under data scarcity. This has significant implications for decision-making scenarios that rely on temporal dynamics.

Key Points

  • Thoth is the first mid-trained LLM with general-purpose time series understanding capabilities.
  • Mid-training with Book-of-Thoth enables Thoth to significantly outperform its base model and other LLMs in time series question answering tasks.
  • KnoTS is a novel benchmark for evaluating advanced reasoning capabilities in time series understanding.

Merits

Strength in Time Series Understanding

Thoth's mid-training with Book-of-Thoth enables it to grasp temporal patterns and understand time series data, which is a significant improvement over its base model and other LLMs.

Robustness under Data Scarcity

Thoth's ability to perform well under data scarcity underscores its effectiveness in real-world decision-making scenarios where data availability is limited.

Demerits

Limitation in Domain-Specific Knowledge

While Thoth excels in general-purpose time series understanding, its performance may not be comparable to specialized models when dealing with domain-specific knowledge.

Dependence on High-Quality Corpus

Thoth's performance relies heavily on the quality of the Book-of-Thoth corpus, which may limit its applicability in scenarios where high-quality training data is not available.

Expert Commentary

The proposed Thoth model represents a significant breakthrough in the field of time series understanding. By bridging the gap between natural language and time series data, Thoth has the potential to transform decision-making scenarios that rely on temporal dynamics. However, its dependence on high-quality training data and potential vulnerability to adversarial attacks are limitations that need to be addressed. Furthermore, the development of Thoth raises important questions about the explainability and transparency of AI models, particularly in high-stakes applications. As this technology continues to evolve, it is essential to prioritize research in these areas to ensure the safe and responsible deployment of Thoth and similar models.

Recommendations

  • Future research should focus on developing explainable and transparent AI models that can provide insights into their decision-making processes, particularly in high-stakes applications.
  • Researchers should prioritize the development of robust and secure AI models that can withstand adversarial attacks on time series data.
