MantisV2: Closing the Zero-Shot Gap in Time Series Classification with Synthetic Data and Test-Time Strategies

arXiv:2602.17868v1 (announce type: cross)

Abstract: Developing foundation models for time series classification is of high practical relevance, as such models can serve as universal feature extractors for diverse downstream tasks. Although early models such as Mantis have shown the promise of this approach, a substantial performance gap remained between frozen and fine-tuned encoders. In this work, we introduce methods that significantly strengthen zero-shot feature extraction for time series. First, we introduce Mantis+, a variant of Mantis pre-trained entirely on synthetic time series. Second, through controlled ablation studies, we refine the architecture and obtain MantisV2, an improved and more lightweight encoder. Third, we propose an enhanced test-time methodology that leverages intermediate-layer representations and refines output-token aggregation. In addition, we show that performance can be further improved via self-ensembling and cross-model embedding fusion. Extensive experiments on UCR, UEA, Human Activity Recognition (HAR) benchmarks, and EEG datasets show that MantisV2 and Mantis+ consistently outperform prior time series foundation models, achieving state-of-the-art zero-shot performance.

Executive Summary

This article summarizes MantisV2, a refined version of the Mantis foundation model for time series classification. The work targets the performance gap between frozen (zero-shot) and fine-tuned encoders, using synthetic pre-training data and test-time strategies. The paper also introduces Mantis+, a variant of Mantis pre-trained entirely on synthetic time series. Controlled ablation studies then lead to the MantisV2 architecture, which the authors describe as improved and more lightweight. At test time, the authors draw on intermediate-layer representations and refine output-token aggregation, and they report further gains from self-ensembling and cross-model embedding fusion. In experiments on the UCR, UEA, Human Activity Recognition (HAR), and EEG benchmarks, MantisV2 and Mantis+ are reported to outperform prior time series foundation models in the zero-shot setting, which supports their use as universal feature extractors for downstream tasks.

Key Points

  • MantisV2 targets the gap between frozen (zero-shot) and fine-tuned encoders in time series classification.
  • MantisV2 and Mantis+ are reported to achieve state-of-the-art zero-shot performance on the UCR, UEA, HAR, and EEG benchmarks.
  • A refined, more lightweight architecture and an enhanced test-time methodology account for the reported gains.

Merits

Improved Performance

The authors report that MantisV2 and Mantis+ consistently outperform prior time series foundation models in the zero-shot setting, narrowing the gap between frozen and fine-tuned encoders.
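To make the frozen-encoder setting concrete, here is a minimal sketch of how zero-shot feature extraction is typically evaluated: the encoder is never updated, and only a lightweight classifier is fit on its embeddings. Everything in it is an assumption for illustration. `frozen_encoder` is a hypothetical hand-written stand-in (not Mantis, Mantis+, or MantisV2), and the nearest-centroid head is just one simple choice of classifier; the source does not say which classifier the authors use.

```python
import math
from collections import defaultdict


def frozen_encoder(series):
    """Hypothetical stand-in for a frozen pre-trained encoder.

    Maps a univariate series to a fixed-size vector. A real foundation
    model would return a learned embedding instead of these statistics.
    """
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    # Mean absolute first difference: how quickly the signal changes.
    roughness = sum(abs(b - a) for a, b in zip(series, series[1:])) / (n - 1)
    return [mean, std, roughness]


def fit_centroids(embeddings, labels):
    """Fit only the lightweight head: one centroid per class."""
    groups = defaultdict(list)
    for emb, lab in zip(embeddings, labels):
        groups[lab].append(emb)
    return {
        lab: [sum(col) / len(col) for col in zip(*vecs)]
        for lab, vecs in groups.items()
    }


def predict(centroids, emb):
    """Assign the label of the closest class centroid."""
    return min(centroids, key=lambda lab: math.dist(centroids[lab], emb))


# Toy labeled data (assumed for illustration only).
train_x = [
    [0, 0, 0, 0, 0, 0],
    [0.1, 0, 0.1, 0, 0.1, 0],
    [1, -1, 1, -1, 1, -1],
    [2, -2, 2, -2, 2, -2],
]
train_y = ["flat", "flat", "oscillating", "oscillating"]

# The encoder is applied as-is; no encoder parameters are trained.
centroids = fit_centroids([frozen_encoder(s) for s in train_x], train_y)
prediction = predict(centroids, frozen_encoder([1.5, -1.5, 1.5, -1.5, 1.5, -1.5]))
```

Fine-tuning would instead update the encoder itself on the downstream labels, which is the more expensive alternative that the frozen setting is compared against.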

Enhanced Model Efficiency

MantisV2 features a more lightweight encoder and refined architecture, making it more efficient and scalable for diverse downstream tasks.

Novel Methodologies

The authors propose an enhanced test-time methodology that draws on intermediate-layer representations and refines output-token aggregation. They also report that self-ensembling and cross-model embedding fusion improve performance further.
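The sketch below illustrates the general idea behind two of these strategies: pooling token outputs from more than one layer, and fusing embeddings from more than one model. It is not the authors' procedure. The source does not say how tokens are aggregated or how embeddings are fused, so mean pooling, L2 normalization, and concatenation are assumptions chosen because they are common defaults. All names and values are hypothetical.

```python
import math


def mean_pool(tokens):
    """Average a list of token vectors into a single vector."""
    return [sum(col) / len(tokens) for col in zip(*tokens)]


def l2_normalize(vec):
    """Scale a vector to unit length; leave an all-zero vector unchanged."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def fuse(embeddings):
    """Concatenate L2-normalized embeddings.

    Normalizing first keeps one source from dominating the fused
    vector purely because of its scale.
    """
    fused = []
    for emb in embeddings:
        fused.extend(l2_normalize(emb))
    return fused


# Hypothetical token outputs for one series, taken from two layers
# of the same encoder (two tokens per layer, two dimensions each).
layer_tokens = {
    "intermediate": [[1.0, 0.0], [3.0, 0.0]],
    "final": [[0.0, 2.0], [0.0, 6.0]],
}
layer_embeddings = [mean_pool(tokens) for tokens in layer_tokens.values()]

# Hypothetical embedding of the same series from a second model.
other_model_embedding = [3.0, 4.0]

fused = fuse(layer_embeddings + [other_model_embedding])
```

The fused vector would then be passed to the same kind of lightweight classifier used for a single embedding, so the encoders themselves stay frozen.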

Demerits

Limited Real-World Application

MantisV2 and Mantis+ are evaluated on established benchmarks (UCR, UEA, HAR, and EEG datasets). Their applicability and scalability in deployed, real-world settings beyond these benchmarks remain to be demonstrated.

Dependence on Synthetic Data

The approach relies on synthetic pre-training data (Mantis+ is pre-trained entirely on synthetic time series), which may not capture the full range of characteristics and complexities found in real-world time series.

Expert Commentary

MantisV2 and Mantis+ represent a notable advance in time series classification, combining synthetic pre-training data with test-time strategies to narrow the gap between frozen and fine-tuned encoders. The reported benchmark results are strong, but applicability and scalability outside the benchmark setting remain to be fully explored. Even so, the breadth of the evaluation supports the case for MantisV2 and Mantis+ as universal feature extractors for diverse downstream tasks.

Recommendations

  • Future research should focus on exploring the real-world applicability and scalability of MantisV2 and Mantis+, including their performance on diverse datasets and domains.
  • Investigating the limitations and challenges of synthetic data in real-world time series classification will be essential in refining and improving MantisV2 and Mantis+.
