Decentralized Attention Fails Centralized Signals: Rethinking Transformers for Medical Time Series

arXiv:2602.18473v1

Abstract: Accurate analysis of medical time series (MedTS) data, such as electroencephalography (EEG) and electrocardiography (ECG), plays a pivotal role in healthcare applications, including the diagnosis of brain and heart diseases. MedTS data typically exhibit two critical patterns: temporal dependencies within individual channels and channel dependencies across multiple channels. While recent advances in deep learning have leveraged Transformer-based models to effectively capture temporal dependencies, they often struggle with modeling channel dependencies. This limitation stems from a structural mismatch: MedTS signals are inherently centralized, whereas the Transformer's attention mechanism is decentralized, making it less effective at capturing global synchronization and unified waveform patterns. To address this mismatch, we propose CoTAR (Core Token Aggregation-Redistribution), a centralized MLP-based module designed to replace decentralized attention. Instead of allowing all tokens to interact directly, as in standard attention, CoTAR introduces a global core token that serves as a proxy to facilitate inter-token interactions, thereby enforcing a centralized aggregation and redistribution strategy. This design not only better aligns with the centralized nature of MedTS signals but also reduces computational complexity from quadratic to linear. Experiments on five benchmarks validate the superiority of our method in both effectiveness and efficiency, achieving up to a 12.13% improvement on the APAVA dataset, while using only 33% of the memory and 20% of the inference time compared to the previous state of the art. Code and all training scripts are available at https://github.com/Levi-Ackman/TeCh.

Executive Summary

This article introduces CoTAR, a centralized MLP-based module designed to improve the analysis of medical time series data. By leveraging a global core token to facilitate inter-token interactions, CoTAR addresses the structural mismatch between decentralized Transformer attention and centralized MedTS signals. Experiments on five benchmarks demonstrate CoTAR's superiority in effectiveness and efficiency, achieving up to a 12.13% improvement on the APAVA dataset. The proposed method reduces computational complexity from quadratic to linear, and code and training scripts are available. This research has significant implications for healthcare applications, including diagnosis and disease monitoring.
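The aggregation-redistribution idea described above can be sketched as a minimal module. This is an illustrative NumPy sketch, not the authors' implementation: the weighting scheme, shapes, and the `W_agg`/`W_red` parameters are assumptions made for the example.

```python
import numpy as np

def cotar_block(tokens, W_agg, W_red):
    """Sketch of core-token aggregation-redistribution.

    tokens: (N, d) array of per-channel/per-patch tokens.
    Aggregation: pool all N tokens into one global core token (O(N)).
    Redistribution: broadcast the transformed core token back (O(N)).
    Total cost is linear in N, versus O(N^2) pairwise attention.
    (Hypothetical parametrization, not the paper's exact module.)
    """
    # Aggregation: softmax-weighted pooling of all tokens into one core token
    scores = tokens @ W_agg                  # (N, 1) per-token importance
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    core = (weights * tokens).sum(axis=0)    # (d,) global core token

    # Redistribution: every token receives an update from the shared core
    update = np.tanh(core @ W_red)           # (d,) MLP-style transform
    return tokens + update                   # broadcast to all N tokens

rng = np.random.default_rng(0)
N, d = 6, 4
tokens = rng.normal(size=(N, d))
out = cotar_block(tokens, rng.normal(size=(d, 1)), rng.normal(size=(d, d)))
```

Because every token interacts only with the single core token, no token-pair matrix is ever materialized, which is where the linear memory and time scaling comes from.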

Key Points

  • The article proposes CoTAR, a centralized MLP-based module to address the limitation of decentralized Transformer attention in medical time series analysis.
  • CoTAR introduces a global core token to facilitate inter-token interactions, aligning with the centralized nature of MedTS signals.
  • Experiments demonstrate CoTAR's superiority in effectiveness and efficiency, achieving significant improvements on five benchmarks.

Merits

Strength in addressing structural mismatch

The proposed CoTAR module effectively addresses the limitation of decentralized Transformer attention in medical time series analysis, leveraging a global core token to facilitate inter-token interactions.

Efficiency improvements

CoTAR reduces computational complexity from quadratic to linear, making it more efficient than previous state-of-the-art methods.
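To make the quadratic-to-linear claim concrete, a back-of-the-envelope count of token interactions (illustrative arithmetic only, not a benchmark):

```python
def interaction_counts(n_tokens):
    """Pairwise attention touches every token pair: n^2 interactions.
    A core-token proxy needs one aggregation pass over all tokens plus
    one redistribution pass: roughly 2n interactions."""
    pairwise = n_tokens * n_tokens
    core_token = 2 * n_tokens
    return pairwise, core_token

for n in (64, 256, 1024):
    p, c = interaction_counts(n)
    print(f"n={n}: pairwise={p}, core-token={c}")
```

At 1024 tokens the pairwise count is over a million while the core-token count stays in the low thousands, which is consistent with the memory and inference-time savings reported above.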

Availability of code and training scripts

The authors provide open-source code and training scripts, facilitating reproducibility and further research in this area.

Demerits

Limited applicability beyond medical time series

The proposed method is specifically designed for medical time series analysis and may not be directly applicable to other domains.

Dependence on global core token

The effectiveness of CoTAR hinges on the learned global core token faithfully summarizing all channels; in scenarios with highly heterogeneous channel behavior, a single proxy token may struggle to represent every interaction.

Expert Commentary

The article presents a compelling case for the limitations of decentralized Transformer attention in medical time series analysis and proposes a novel solution, CoTAR, to address this issue. The proposed method demonstrates significant improvements in effectiveness and efficiency, making it a valuable contribution to the field. However, the reliance on a global core token may be a limitation in certain scenarios, and further research is needed to explore the generalizability of CoTAR to other domains.

Recommendations

  • Future research should explore the application of CoTAR to other medical time series analysis tasks, such as heart rate variability and blood pressure monitoring.
  • The authors should investigate the potential for CoTAR to be adapted for other domains, such as finance and climate modeling, where similar structural mismatches may occur.
