Integrating Machine Learning Ensembles and Large Language Models for Heart Disease Prediction Using Voting Fusion

arXiv:2602.22280v1 Announce Type: new

Abstract: Cardiovascular disease is the primary cause of death globally, necessitating early identification, precise risk classification, and dependable decision-support technologies. The advent of large language models (LLMs) provides new zero-shot and few-shot reasoning capabilities, even though machine learning (ML) algorithms, especially ensemble approaches like Random Forest, XGBoost, LightGBM, and CatBoost, are excellent at modeling complex, non-linear patient data and routinely beat logistic regression. This research predicts cardiovascular disease using a merged dataset of 1,190 patient records, comparing traditional machine learning models (95.78% accuracy, ROC-AUC 0.96) with open-source large language models via OpenRouter APIs. Finally, a hybrid fusion of the ML ensemble and LLM reasoning under Gemini 2.5 Flash achieved the best results (96.62% accuracy, 0.97 AUC), showing that LLMs (78.9% accuracy) work best when combined with ML models rather than used alone. Results show that ML ensembles achieved the highest performance (95.78% accuracy, ROC-AUC 0.96), while LLMs performed moderately in zero-shot (78.9%) and slightly better in few-shot (72.6%) settings. The proposed hybrid method enhanced the strength in uncertain situations, illustrating that ensemble ML is considered the best structured tabular prediction case, but it can be integrated with hybrid ML-LLM systems to provide a minor increase and open the way to more reliable clinical decision-support tools.

Executive Summary

This article summarizes a study that combines machine learning (ML) ensembles with large language models (LLMs) for heart disease prediction on a merged dataset of 1,190 patient records. The study compares traditional ML models, open-source LLMs accessed via OpenRouter APIs, and a hybrid fusion of the two. The ML ensembles reach 95.78% accuracy (ROC-AUC 0.96), while the LLMs alone reach 78.9% in zero-shot and 72.6% in few-shot settings. The hybrid fusion under Gemini 2.5 Flash reports the best figures (96.62% accuracy, 0.97 AUC), a gain of 0.84 percentage points in accuracy over the ML ensemble alone. The authors conclude that ensemble ML remains the strongest approach for structured tabular prediction, and that adding LLM reasoning gives a minor improvement, mainly in uncertain cases, that may support more reliable clinical decision-support tools.

Key Points

  • The study integrates machine learning ensembles and large language models for heart disease prediction
  • Machine learning ensembles achieve high performance (95.78% accuracy, ROC-AUC 0.96)
  • Large language models perform moderately in zero-shot (78.9%) and few-shot (72.6%) settings
  • Hybrid fusion of ML ensemble and LLM reasoning achieves best results (96.62% accuracy, 0.97 AUC)
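The abstract does not spell out the fusion rule, so the following is only a minimal sketch of one plausible voting fusion, not the authors' method. It assumes each ML model and the LLM emit a positive-class probability, averages the ML probabilities (soft voting), and blends in the LLM only when the ensemble is uncertain. The function names, the `llm_weight` value, and the `uncertain_band` limits are hypothetical choices made for illustration.

```python
from typing import Optional, Sequence, Tuple


def soft_vote(probs: Sequence[float],
              weights: Optional[Sequence[float]] = None) -> float:
    """Weighted average of positive-class probabilities (soft voting)."""
    if not probs:
        raise ValueError("need at least one probability")
    if weights is None:
        weights = [1.0] * len(probs)
    if len(weights) != len(probs):
        raise ValueError("weights and probs must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * p for w, p in zip(weights, probs)) / total


def fuse(ml_probs: Sequence[float],
         llm_prob: float,
         llm_weight: float = 0.2,                          # hypothetical value
         uncertain_band: Tuple[float, float] = (0.35, 0.65),  # hypothetical band
         threshold: float = 0.5) -> Tuple[int, float]:
    """Return (label, fused probability).

    Assumption: the LLM vote is blended in only when the ML ensemble's
    averaged probability falls inside the uncertain band.
    """
    ml_avg = soft_vote(ml_probs)
    low, high = uncertain_band
    if low <= ml_avg <= high:
        fused = (1.0 - llm_weight) * ml_avg + llm_weight * llm_prob
    else:
        fused = ml_avg  # confident ensemble: the LLM is ignored
    return int(fused >= threshold), fused


# Four hypothetical model outputs (e.g. one per ensemble member)
# plus one hypothetical LLM probability.
label, prob = fuse([0.45, 0.5, 0.55, 0.5], llm_prob=0.9)
print(label, round(prob, 2))
```

Gating the LLM on ensemble uncertainty keeps the stronger tabular model in control of confident cases, which matches the abstract's observation that LLMs alone are much weaker than the ensembles.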

Merits

Strength in uncertain situations

According to the abstract, the hybrid method improves robustness in uncertain situations. The measured gain over the ML ensemble alone is small: 0.84 percentage points in accuracy (95.78% to 96.62%) and 0.01 in AUC (0.96 to 0.97).

Demerits

Limited dataset

The study uses a relatively small merged dataset of 1,190 patient records, which may limit the generalizability of the findings.
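To make the sample-size concern concrete, the sketch below computes a Wilson score interval for an accuracy estimate. The abstract does not state the test-set size, so `n_test` is a hypothetical value (a 20% hold-out of 1,190 records) and the count of correct predictions is derived from it purely for illustration.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("require n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# Hypothetical split: the abstract does not report the test-set size.
n_test = 238                        # assumed 20% of 1,190 records
correct = round(0.9578 * n_test)    # illustrative count, not a reported figure
low, high = wilson_interval(correct, n_test)
print(f"accuracy {correct / n_test:.4f}, 95% interval ({low:.4f}, {high:.4f})")
```

With a test set of a few hundred records, the interval spans several percentage points, which is wider than the 0.84-point difference between the hybrid and the ML ensemble. Under this assumed split, the two results would be hard to separate statistically.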

Dependence on large language model APIs

The study accesses open-source LLMs through OpenRouter APIs and performs the hybrid fusion under Gemini 2.5 Flash. This reliance on externally hosted models introduces dependencies that may limit the scalability of the proposed method.
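One way to contain this dependency is to treat the LLM as optional and fall back to the ML prediction whenever the remote call fails or returns something unusable. The sketch below is an assumption about how such a guard could look, not part of the study. `llm_call` is a hypothetical callable that wraps whatever API client is in use, and no real OpenRouter or Gemini calls are shown.

```python
from typing import Any, Callable, Mapping, Tuple


def llm_probability_or_fallback(
    record: Mapping[str, Any],
    ml_prob: float,
    llm_call: Callable[[Mapping[str, Any]], Any],  # hypothetical API wrapper
) -> Tuple[float, str]:
    """Return (probability, source), where source is 'llm' or 'ml_only'."""
    try:
        value = float(llm_call(record))
    except Exception:
        # Network error, timeout, rate limit, or an unparseable reply.
        return ml_prob, "ml_only"
    if not 0.0 <= value <= 1.0:
        # Out-of-range or NaN output is treated as unusable.
        return ml_prob, "ml_only"
    return value, "llm"


def flaky_llm(record: Mapping[str, Any]) -> float:
    """Stand-in for a remote call that is currently failing."""
    raise TimeoutError("simulated API timeout")


patient = {"age": 58, "chest_pain_type": 2}  # hypothetical field names
print(llm_probability_or_fallback(patient, 0.7, flaky_llm))
```

Because the ML ensemble alone already reaches 95.78% accuracy in the study, degrading to it during an outage costs little compared with the reported hybrid figure.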

Expert Commentary

The study's main finding is that ensemble ML remains the strongest approach for structured tabular prediction of heart disease, and that fusing it with LLM reasoning yields only a minor gain (95.78% to 96.62% accuracy, 0.96 to 0.97 AUC). LLMs used alone fall well short of the ensembles, at 78.9% zero-shot and 72.6% few-shot accuracy. The abstract describes the few-shot setting as slightly better than zero-shot, yet the reported few-shot figure is lower, a discrepancy the abstract does not resolve. The findings are further limited by the relatively small dataset and by the dependence on LLM APIs. Hybrid approaches of this kind still merit further study, particularly for the uncertain cases where the abstract reports that the fusion adds robustness.

Recommendations

  • Further research is needed to test whether integrating machine learning ensembles and large language models helps on other prediction tasks.
  • More robust and scalable hybrid methods, evaluated on datasets larger than 1,190 records, are needed to improve the reliability and generalizability of the proposed approach.