DT-BEHRT: Disease Trajectory-aware Transformer for Interpretable Patient Representation Learning
arXiv:2603.10180v1. Abstract: The growing adoption of electronic health record (EHR) systems has provided unprecedented opportunities for predictive modeling to guide clinical decision making. Structured EHRs contain longitudinal observations of patients across hospital visits, where each visit is represented by a set of medical codes. While sequence-based, graph-based, and graph-enhanced sequence approaches have been developed to capture rich code interactions over time or within the same visits, they often overlook the inherent heterogeneous roles of medical codes arising from distinct clinical characteristics and contexts. To this end, in this study we propose the Disease Trajectory-aware Transformer for EHR (DT-BEHRT), a graph-enhanced sequential architecture that disentangles disease trajectories by explicitly modeling diagnosis-centric interactions within organ systems and capturing asynchronous progression patterns. To further enhance the representation robustness, we design a tailored pre-training methodology that combines trajectory-level code masking with ontology-informed ancestor prediction, promoting semantic alignment across multiple modeling modules. Extensive experiments on multiple benchmark datasets demonstrate that DT-BEHRT achieves strong predictive performance and provides interpretable patient representations that align with clinicians' disease-centered reasoning. The source code is publicly accessible at https://github.com/GatorAIM/DT-BEHRT.git.
Executive Summary
This study proposes DT-BEHRT, a graph-enhanced sequential architecture for patient representation learning from structured electronic health records (EHRs). DT-BEHRT disentangles disease trajectories by explicitly modeling diagnosis-centric interactions within organ systems and by capturing asynchronous progression patterns. A tailored pre-training methodology, rather than the architecture itself, combines trajectory-level code masking with ontology-informed ancestor prediction to improve representation robustness. The authors report strong predictive performance on multiple benchmark datasets, along with interpretable patient representations that align with clinicians' disease-centered reasoning. The abstract reports predictive performance, not patient outcomes, so the value of DT-BEHRT for clinical decision-making is a plausible implication rather than a demonstrated result.
Key Points
- DT-BEHRT is a graph-enhanced sequential architecture for patient representation learning from electronic health records.
- The architecture explicitly models diagnosis-centric interactions within organ systems and captures asynchronous progression patterns.
- Experiments demonstrate strong predictive performance on multiple benchmark datasets.
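To make the data model behind these points concrete, here is a minimal Python sketch of a structured EHR record (a list of visits, each a set of medical codes) split into per-organ-system diagnosis trajectories. It is an illustration under stated assumptions, not the authors' implementation: the use of ICD-9 chapter ranges as a proxy for organ systems, the toy patient, and the names `ORGAN_SYSTEMS`, `organ_system`, and `split_trajectories` are all hypothetical.

```python
from collections import defaultdict

# Assumption: a few ICD-9 chapter ranges stand in for "organ systems".
# The paper's actual grouping is not given in the abstract.
ORGAN_SYSTEMS = [
    (240, 279, "endocrine"),
    (390, 459, "circulatory"),
    (460, 519, "respiratory"),
    (580, 629, "genitourinary"),
]


def organ_system(icd9_code):
    """Map a numeric ICD-9 diagnosis code such as '428.0' to a toy organ system."""
    try:
        category = int(icd9_code.split(".")[0])
    except ValueError:
        return "other"  # non-numeric codes (e.g. V or E codes) are not grouped here
    for low, high, name in ORGAN_SYSTEMS:
        if low <= category <= high:
            return name
    return "other"


def split_trajectories(visits):
    """Group diagnosis codes by organ system, keeping the visit index of each code."""
    trajectories = defaultdict(list)
    for visit_index, codes in enumerate(visits):
        for code in sorted(codes):  # sort for a deterministic order within a visit
            trajectories[organ_system(code)].append((visit_index, code))
    return dict(trajectories)


# Hypothetical patient: three visits, each a set of diagnosis codes.
patient = [
    {"250.00", "401.9"},
    {"428.0"},
    {"250.00", "486", "428.0"},
]
trajectories = split_trajectories(patient)
```

In this toy record the circulatory trajectory spans all three visits, while the endocrine trajectory appears only in the first and third, which is one simple way to picture trajectories that progress asynchronously.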
Merits
Interpretable Patient Representations
DT-BEHRT provides interpretable patient representations that align with clinicians' disease-centered reasoning, which could support more informed clinical decision-making.
Enhanced Representation Robustness
The pre-training methodology combines trajectory-level code masking with ontology-informed ancestor prediction. The authors design it to improve representation robustness and to promote semantic alignment across the model's modules.
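The two pre-training signals can be pictured with a small Python sketch. It is a simplified illustration, not the method from the paper: it assumes that trajectory-level masking hides every code of one randomly chosen trajectory, and that an ancestor target can be approximated by the three-digit ICD-9 category of the masked code. The names `MASK`, `icd9_ancestor`, `mask_trajectory`, and the `trajectory_of` mapping are hypothetical.

```python
import random

MASK = "[MASK]"


def icd9_ancestor(code):
    """Return the three-digit ICD-9 category as a coarse ancestor, e.g. '428.0' -> '428'."""
    return code.split(".")[0]


def mask_trajectory(sequence, trajectory_of, rng):
    """Mask every code that belongs to one randomly chosen trajectory.

    sequence: list of codes in visit order.
    trajectory_of: dict mapping each code to a trajectory label (an assumption).
    Returns (masked_sequence, targets), where targets maps each masked
    position to (original code, ancestor category).
    """
    labels = sorted({trajectory_of[code] for code in sequence})
    chosen = rng.choice(labels)
    masked, targets = [], {}
    for position, code in enumerate(sequence):
        if trajectory_of[code] == chosen:
            masked.append(MASK)
            targets[position] = (code, icd9_ancestor(code))
        else:
            masked.append(code)
    return masked, targets


# Hypothetical flattened code sequence and trajectory labels.
sequence = ["250.00", "401.9", "428.0", "250.00", "486"]
trajectory_of = {
    "250.00": "endocrine",
    "401.9": "circulatory",
    "428.0": "circulatory",
    "486": "respiratory",
}
masked, targets = mask_trajectory(sequence, trajectory_of, random.Random(0))
```

Each masked position carries two labels here: the original code, for a masked-code objective, and its coarser ancestor, for an ontology-level objective. A model trained on both would be rewarded for getting the disease category right even when the exact code is ambiguous.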
Demerits
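Limited Generalizability to Unstructured EHRs
The study focuses on structured EHRs, in which each visit is represented by a set of medical codes. Whether the approach generalizes to unstructured EHR data, such as free-text clinical notes, or to other data sources is untested.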
Dependence on High-Quality Annotation
DT-BEHRT's performance may depend on the quality of the recorded medical codes and of the ontology used to define disease trajectories, and high-quality versions of both can be difficult to obtain.
Expert Commentary
The study suggests that modeling disease trajectories explicitly can yield patient representations that are both predictive and interpretable, which is relevant to clinical decision-support systems. Its limitations, namely untested generalizability to unstructured EHRs and dependence on high-quality annotation, mean the results are promising but need further investigation before they can guide practice. Claims about improved patient outcomes, or about informing policy and resource allocation in healthcare settings, go beyond what the abstract reports and would require prospective evaluation.
Recommendations
- Future studies should investigate whether DT-BEHRT generalizes to unstructured EHRs and other data sources.
- High-quality annotation protocols and tools are needed before DT-BEHRT and similar architectures can be adopted widely.