MedFeat: Model-Aware and Explainability-Driven Feature Engineering with LLMs for Clinical Tabular Prediction
arXiv:2603.02221v1. Abstract: In healthcare tabular predictions, classical models with feature engineering often outperform neural approaches. Recent advances in Large Language Models enable the integration of domain knowledge into feature engineering, offering a promising direction. However, existing approaches typically rely on a broad search over predefined transformations, overlooking downstream model characteristics and feature importance signals. We present MedFeat, a feedback-driven and model-aware feature engineering framework that leverages LLM reasoning with domain knowledge and provides feature explanations based on SHAP values while tracking successful and failed proposals to guide feature discovery. By incorporating model awareness, MedFeat prioritizes informative signals that are difficult for the downstream model to learn directly due to its characteristics. Across a broad range of clinical prediction tasks, MedFeat achieves stable improvements over various baselines and discovers clinically meaningful features that generalize under distribution shift, demonstrating robustness across years and from ICU cohorts to general hospitalized patients, thereby offering insights into real-world deployment. Code required to reproduce our experiments will be released, subject to dataset agreements and institutional policies.
Executive Summary
This article reviews MedFeat, a feature engineering framework that uses Large Language Models (LLMs) to improve clinical tabular prediction. MedFeat is feedback-driven and model-aware: it combines LLM reasoning with domain knowledge, explains features with SHAP values, and tracks successful and failed proposals to guide feature discovery. The authors report stable improvements over various baselines and clinically meaningful features that generalize under distribution shift, both across years and from ICU cohorts to general hospitalized patients. They state that the code required to reproduce the experiments will be released, subject to dataset agreements and institutional policies. Taken together, the results suggest that LLM-guided feature engineering can complement classical tabular models in healthcare.
Key Points
- ▸ MedFeat is a model-aware and explainability-driven feature engineering framework that leverages LLMs for clinical tabular predictions.
- ▸ The framework integrates domain knowledge and provides feature explanations based on SHAP values.
- ▸ MedFeat achieves stable improvements over various baselines and discovers clinically meaningful features that generalize under distribution shift.
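The feedback-driven loop described in the key points can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `propose` stands in for the LLM call, `score` stands in for training and validating the downstream model, and the feature names and score values are made up for the example.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# One history entry per proposal: (feature name, validation gain, accepted?)
History = List[Tuple[str, float, bool]]


def run_feedback_loop(
    propose: Callable[[History], Optional[str]],
    score: Callable[[Sequence[str]], float],
    max_rounds: int = 10,
    min_gain: float = 0.0,
) -> Tuple[List[str], History]:
    """Greedy loop: keep a proposed feature only if the score improves.

    Both successes and failures are recorded so the proposer can see them.
    """
    accepted: List[str] = []
    history: History = []
    best = score(accepted)
    for _ in range(max_rounds):
        name = propose(history)  # hypothetical stand-in for an LLM proposal
        if name is None:
            break
        gain = score(accepted + [name]) - best
        keep = gain > min_gain
        history.append((name, gain, keep))
        if keep:
            accepted.append(name)
            best += gain
    return accepted, history


# Toy stand-ins (assumptions for illustration only, not results from the paper).
TOY_GAINS = {"bun_creatinine_ratio": 0.03, "age_squared": -0.01, "shock_index": 0.02}


def toy_propose(history: History) -> Optional[str]:
    tried = {name for name, _, _ in history}
    for name in TOY_GAINS:
        if name not in tried:
            return name
    return None


def toy_score(features: Sequence[str]) -> float:
    return 0.70 + sum(TOY_GAINS[f] for f in features)


accepted, history = run_feedback_loop(toy_propose, toy_score)
print(accepted)
```

In a real pipeline the history would be serialized into the next prompt so that rejected ideas are not proposed again; how MedFeat formats that feedback is not specified in the abstract.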
Merits
Strength in Model Awareness
MedFeat's model-aware approach prioritizes informative signals that the downstream model struggles to learn directly because of its characteristics. The abstract credits this design with stable improvements over various baselines.
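A small example may make the idea of a signal that is "difficult for the downstream model to learn directly" concrete. The sketch below assumes a downstream model built from axis-aligned threshold splits (a single decision stump stands in for a tree-based model) and uses a ratio of heart rate to systolic blood pressure as the engineered feature. The choice of model, the feature, and the synthetic grid are all assumptions for illustration, not details taken from the paper.

```python
from typing import Sequence


def best_stump_accuracy(values: Sequence[float], labels: Sequence[bool]) -> float:
    """Best accuracy achievable by one threshold rule on a single feature."""
    n = len(labels)
    best = 0.0
    for threshold in sorted(set(values)):
        for above_is_positive in (True, False):
            correct = sum(
                ((v > threshold) == above_is_positive) == y
                for v, y in zip(values, labels)
            )
            best = max(best, correct / n)
    return best


# Synthetic grid of vital signs (not patient data).
heart_rate = [hr for hr in range(50, 151, 10) for _ in range(70, 171, 10)]
systolic_bp = [sbp for _ in range(50, 151, 10) for sbp in range(70, 171, 10)]

# The label depends on the ratio, which no single raw-feature split can express.
ratio = [hr / sbp for hr, sbp in zip(heart_rate, systolic_bp)]
labels = [r > 1.0 for r in ratio]

raw_best = max(
    best_stump_accuracy(heart_rate, labels),
    best_stump_accuracy(systolic_bp, labels),
)
ratio_best = best_stump_accuracy(ratio, labels)
print(f"best single split on raw features: {raw_best:.2f}")
print(f"single split on the ratio feature: {ratio_best:.2f}")
```

A deeper tree can approximate the diagonal boundary with many splits, but the ratio hands it over in one, which is the kind of feature a model-aware proposer would favor for such a model.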
Explainability through SHAP Values
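MedFeat provides feature explanations based on SHAP values and uses them, together with a record of successful and failed proposals, to guide feature discovery. Such explanations may also help clinicians see which features drive a prediction, although the abstract does not report a clinician-facing evaluation.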
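For readers unfamiliar with SHAP, the underlying quantity is the Shapley value of each feature. The sketch below computes exact Shapley values by enumerating feature subsets against a single baseline row. It is a simplified illustration under stated assumptions: the toy `predict` function and the input values are invented, and practical SHAP implementations use background datasets and faster model-specific algorithms rather than brute-force enumeration.

```python
from itertools import combinations
from math import factorial
from typing import Callable, List, Sequence


def shapley_values(
    predict: Callable[[Sequence[float]], float],
    x: Sequence[float],
    baseline: Sequence[float],
) -> List[float]:
    """Exact Shapley values of x relative to one baseline row.

    Features outside the coalition are replaced by their baseline value.
    Cost grows as 2^n, so this is only practical for a handful of features.
    """
    n = len(x)

    def value(coalition: frozenset) -> float:
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


# Toy model (an assumption): one additive term plus one interaction term.
def predict(z: Sequence[float]) -> float:
    return 0.5 * z[0] + 2.0 * z[1] * z[2]


x = [4.0, 1.5, 2.0]
baseline = [2.0, 0.0, 0.0]
phi = shapley_values(predict, x, baseline)
print(phi)
```

The attributions sum to `predict(x) - predict(baseline)`, and the interaction term is split evenly between the two features that produce it. This additivity is what lets per-feature importance be read off and fed back into a proposal loop.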
Robustness and Generalizability
The discovered features generalize under distribution shift: the authors report robustness across years and from ICU cohorts to general hospitalized patients, which supports the case for real-world deployment.
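One simple way to probe robustness across years is to fit or fix a rule on earlier records and compare its accuracy on later ones. The sketch below is an assumed evaluation setup, not the authors' protocol: the records, the `shock_index` feature, the threshold rule, and the cutoff year are all invented for illustration.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, float]


def temporal_split(records: List[Record], cutoff_year: int) -> Tuple[List[Record], List[Record]]:
    """Split records into those up to the cutoff year and those after it."""
    earlier = [r for r in records if r["year"] <= cutoff_year]
    later = [r for r in records if r["year"] > cutoff_year]
    return earlier, later


def accuracy(rule: Callable[[Record], bool], records: List[Record]) -> float:
    hits = sum(rule(r) == bool(r["label"]) for r in records)
    return hits / len(records)


def shift_gap(rule: Callable[[Record], bool], records: List[Record], cutoff_year: int) -> float:
    """Accuracy on earlier records minus accuracy on later records."""
    earlier, later = temporal_split(records, cutoff_year)
    return accuracy(rule, earlier) - accuracy(rule, later)


# Made-up toy records (not patient data).
records = [
    {"year": 2015, "shock_index": 1.2, "label": 1},
    {"year": 2015, "shock_index": 0.6, "label": 0},
    {"year": 2016, "shock_index": 1.1, "label": 1},
    {"year": 2016, "shock_index": 0.7, "label": 0},
    {"year": 2019, "shock_index": 1.3, "label": 1},
    {"year": 2019, "shock_index": 0.95, "label": 1},
    {"year": 2020, "shock_index": 0.5, "label": 0},
    {"year": 2020, "shock_index": 0.8, "label": 0},
]


def rule(r: Record) -> bool:
    return r["shock_index"] > 1.0


gap = shift_gap(rule, records, cutoff_year=2016)
print(f"accuracy drop on later years: {gap:.2f}")
```

The same split-and-compare pattern applies to cohort shift by replacing the year test with a cohort flag, for example ICU versus general ward.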
Demerits
Limited Scope and Diversity of Datasets
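Although the abstract reports a broad range of clinical prediction tasks, it does not name the datasets, and the reported shifts (across years, and from ICU cohorts to general hospitalized patients) cover only part of the variation seen in practice. The findings may therefore not transfer to other healthcare domains or institutions.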
Dependence on LLMs and Domain Knowledge
MedFeat's reliance on LLMs and domain knowledge may limit its applicability in settings where access to capable LLMs or clinical expertise is restricted.
Potential for Overfitting and Over-Engineering
The feedback-driven search evaluates many candidate features against the same data, so it may overfit to the validation signal or accumulate redundant features unless feature acceptance is carefully monitored and controlled.
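One common safeguard against this risk is to accept a candidate feature only when its gain is consistent across cross-validation folds, rather than driven by one lucky split. The sketch below shows such an acceptance rule. It is a generic guard, not a mechanism described in the abstract, and the thresholds and fold gains are arbitrary illustrative values.

```python
from typing import Sequence


def accept_feature(
    fold_gains: Sequence[float],
    min_mean_gain: float = 0.002,
    min_positive_frac: float = 0.8,
) -> bool:
    """Accept only if the gain is large enough on average AND consistent.

    fold_gains holds the per-fold change in the validation metric after
    adding the candidate feature (positive means improvement).
    """
    mean_gain = sum(fold_gains) / len(fold_gains)
    positive_frac = sum(g > 0 for g in fold_gains) / len(fold_gains)
    return mean_gain >= min_mean_gain and positive_frac >= min_positive_frac


# Consistent small gains across five folds: accepted.
consistent = accept_feature([0.010, 0.008, 0.012, 0.009, 0.011])

# Same-sized mean gain would pass alone, but it comes from one fold: rejected.
one_lucky_fold = accept_feature([0.050, -0.010, -0.020, 0.000, -0.005])

print(consistent, one_lucky_fold)
```

Keeping a final test set that the loop never sees remains necessary, since any acceptance rule tuned on validation folds can itself be overfit.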
Expert Commentary
The article presents a promising approach to feature engineering for clinical tabular prediction, using LLMs to integrate domain knowledge and SHAP values to explain features. The reported results are encouraging, but the limitations noted above, particularly the dependence on LLMs and domain knowledge, need careful consideration before deployment. The work contributes to ongoing discussions on AI in clinical decision-making, the explainability of AI-driven predictions, and generalization under distribution shift.
Recommendations
- ✓ Future research should focus on extending the scope and diversity of datasets used in MedFeat to better understand its generalizability across different healthcare domains and datasets.
- ✓ Developers of feature engineering frameworks for clinical decision support should prioritize robustness and interpretability, as MedFeat does, so that engineered features can be audited before they influence clinical decisions.