EQ-5D Classification Using Biomedical Entity-Enriched Pre-trained Language Models and Multiple Instance Learning
arXiv:2602.21216v1. Abstract: The EQ-5D (EuroQol 5-Dimensions) is a standardized instrument for the evaluation of health-related quality of life. In health economics, systematic literature reviews (SLRs) depend on the correct identification of publications that use the EQ-5D, but manual screening of large volumes of scientific literature is time-consuming, error-prone, and inconsistent. In this study, we investigate fine-tuning of general-purpose (BERT) and domain-specific (SciBERT, BioBERT) pre-trained language models (PLMs), enriched with biomedical entity information extracted through scispaCy models for each statement, to improve EQ-5D detection from abstracts. We conduct nine experimental setups, including combining three scispaCy models with three PLMs, and evaluate their performance at both the sentence and study levels. Furthermore, we explore a Multiple Instance Learning (MIL) approach with attention pooling to aggregate sentence-level information into study-level predictions, where each abstract is represented as a bag of enriched sentences (by scispaCy). The findings indicate consistent improvements in F1-scores (reaching 0.82) and nearly perfect recall at the study-level, significantly exceeding classical bag-of-words baselines and recently reported PLM baselines. These results show that entity enrichment significantly improves domain adaptation and model generalization, enabling more accurate automated screening in systematic reviews.
Executive Summary
This study investigates whether fine-tuned pre-trained language models (PLMs), enriched with biomedical entity information, improve the detection of publications that use the EQ-5D, a standardized instrument for evaluating health-related quality of life, from scientific abstracts. The authors report consistent improvements in F1-scores (reaching 0.82) and nearly perfect recall at the study level, exceeding classical bag-of-words baselines and recently reported PLM baselines. They conclude that entity enrichment improves domain adaptation and model generalization, enabling more accurate automated screening in systematic reviews.
Key Points
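- ▸ Fine-tuning of general-purpose (BERT) and domain-specific (SciBERT, BioBERT) pre-trained language models (PLMs) for EQ-5D detection
- ▸ Incorporation of biomedical entity information, extracted with scispaCy, to improve model performance
- ▸ Multiple Instance Learning (MIL) with attention pooling to aggregate sentence-level information into study-level predictions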
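The MIL aggregation step can be illustrated with a small sketch. The source says only that each abstract is treated as a bag of sentences and pooled with attention; everything else here is an assumption for illustration: the two-dimensional toy vectors, the simplified scoring rule (a dot product with `tanh` of each sentence vector, with no learned projection matrix), and the names `attention_pool`, `study_probability`, `attn_w`, `clf_w`, and `clf_b`. In a real model the sentence vectors would come from the fine-tuned PLM and all weights would be learned.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(sentence_vectors, attn_w):
    """Pool a bag of sentence vectors into one study-level vector.

    Simplified attention (an assumption, not the paper's exact form):
    score_i = attn_w . tanh(h_i), weights = softmax(scores),
    pooled = sum_i weights_i * h_i.
    """
    scores = [
        sum(w * math.tanh(x) for w, x in zip(attn_w, h))
        for h in sentence_vectors
    ]
    weights = softmax(scores)
    dim = len(sentence_vectors[0])
    pooled = [
        sum(a * h[j] for a, h in zip(weights, sentence_vectors))
        for j in range(dim)
    ]
    return pooled, weights


def study_probability(pooled, clf_w, clf_b):
    """Logistic classifier on the pooled vector (hypothetical weights)."""
    z = sum(c * p for c, p in zip(clf_w, pooled)) + clf_b
    return 1.0 / (1.0 + math.exp(-z))


# Toy bag: three sentence vectors for one abstract (made-up numbers).
bag = [[0.1, -0.2], [2.0, 1.5], [0.0, 0.3]]
attn_w = [1.0, 1.0]

pooled, weights = attention_pool(bag, attn_w)
prob = study_probability(pooled, clf_w=[1.0, 1.0], clf_b=-1.0)

print("attention weights:", [round(a, 3) for a in weights])
print("study-level probability:", round(prob, 3))
```

The attention weights double as a rough indication of which sentence drove the study-level decision; in this toy bag the second sentence receives the largest weight.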
Merits
Improved Accuracy
The reported F1-scores reach 0.82, with nearly perfect recall at the study level, indicating that few publications that use the EQ-5D are missed.
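To make the relationship between these metrics concrete, the sketch below computes precision, recall, and F1 from confusion counts and inverts the F1 formula to get the precision implied by a given F1 and recall. The counts are hypothetical and not taken from the paper. The last line is pure arithmetic under two assumptions the source does not confirm: that recall is exactly 1.0 (it says only "nearly perfect") and that the 0.82 F1 refers to the same evaluation.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def implied_precision(f1, recall):
    """Solve F1 = 2PR / (P + R) for P, giving P = F1 * R / (2R - F1)."""
    denom = 2 * recall - f1
    if denom <= 0:
        raise ValueError("no valid precision for this F1/recall pair")
    return f1 * recall / denom


# Hypothetical study-level counts, for illustration only.
p, r, f1 = precision_recall_f1(tp=90, fp=30, fn=10)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")

# Assumption: recall exactly 1.0 alongside F1 = 0.82.
print(f"implied precision: {implied_precision(0.82, 1.0):.3f}")
```

Under those assumptions the implied precision is roughly 0.69, which would mean a screener still has to discard some false positives by hand while missing almost nothing.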
Entity Enrichment
According to the authors, adding biomedical entity information extracted with scispaCy significantly improves domain adaptation and model generalization.
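The source does not say how the extracted entities are combined with each sentence, so the following is only one plausible sketch: entity mentions are appended to the sentence after a separator before the text is passed to the PLM tokenizer. The function names `span_of` and `enrich_sentence`, the `" [SEP] "` separator, the output format, and the `"ENTITY"` label are all assumptions. The entity spans are supplied by hand here; in practice they would come from a scispaCy pipeline's named-entity output.

```python
from typing import List, Tuple

# (start_char, end_char, label), the kind of span an NER pipeline returns.
Entity = Tuple[int, int, str]


def span_of(sentence: str, mention: str, label: str) -> Entity:
    """Locate a mention in the sentence and return its character span."""
    start = sentence.index(mention)
    return (start, start + len(mention), label)


def enrich_sentence(sentence: str, entities: List[Entity],
                    sep: str = " [SEP] ") -> str:
    """Append entity mentions and labels to a sentence (hypothetical format)."""
    mentions = [
        f"{sentence[start:end]} ({label})"
        for start, end, label in sorted(entities)
    ]
    if not mentions:
        return sentence  # nothing to add
    return sentence + sep + "; ".join(mentions)


sentence = "Quality of life was measured with the EQ-5D questionnaire."
# Hand-written stand-ins for NER output; the label is a placeholder.
entities = [
    span_of(sentence, "EQ-5D", "ENTITY"),
    span_of(sentence, "Quality of life", "ENTITY"),
]

enriched = enrich_sentence(sentence, entities)
print(enriched)
```

Appending rather than inserting inline markers keeps the original sentence intact, so the same enrichment step can be switched on or off when comparing setups.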
Demerits
Limited Generalizability
The evaluation covers only EQ-5D detection from abstracts, so the findings may not generalize to other domains or tasks; further research is needed to validate the results.
Expert Commentary
This study demonstrates the potential of combining pre-trained language models with biomedical entity information to make systematic literature reviews more accurate and efficient. The results are relevant to health economics research, where SLRs depend on correctly identifying publications that use the EQ-5D. However, the findings still need to be validated, and their generalizability to other domains and tasks remains to be explored.
Recommendations
- ✓ Further research to validate the findings and explore their generalizability
- ✓ Investigation of the potential applications of the study's methods in other domains and tasks