Enhancing heart disease prediction with stacked ensemble and MCDM-based ranking: an optimized RST-ML approach.

Impact factor: 3.2; JCR quartile: Q1 (Health Care Sciences & Services)

Frontiers in digital health. Pub Date: 2025-06-19; eCollection Date: 2025-01-01; DOI: 10.3389/fdgth.2025.1609308

T Ashika, G Hannah Grace

Volume 7, article 1609308. Publication type: Journal Article. Full-text PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12222165/pdf/

Citations: 0

Abstract

Introduction: Cardiovascular disease (CVD) is a leading cause of death worldwide, which makes accurate diagnostic models a necessity. This study presents an Optimized Rough Set Theory-Machine Learning (RST-ML) framework that integrates Multi-Criteria Decision-Making (MCDM) for heart disease (HD) prediction. Using RST for feature selection, the framework reduces dimensionality while retaining essential information.
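The abstract does not say which RST reduction algorithm the authors use. As a minimal sketch of the general idea, the following assumes a greedy forward search driven by the rough-set dependency degree (often called QuickReduct) on already-discretized data. The function names and the toy table are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

def dependency(rows, labels, attrs):
    """Rough-set dependency degree: the fraction of rows whose equivalence
    class (rows identical on `attrs`) carries a single decision label."""
    seen_labels = defaultdict(set)  # attribute-value tuple -> labels seen
    counts = defaultdict(int)       # attribute-value tuple -> number of rows
    for row, label in zip(rows, labels):
        key = tuple(row[a] for a in attrs)
        seen_labels[key].add(label)
        counts[key] += 1
    consistent = sum(n for key, n in counts.items() if len(seen_labels[key]) == 1)
    return consistent / len(rows)

def quick_reduct(rows, labels):
    """Greedy forward selection: keep adding the attribute that raises the
    dependency degree most, until it equals that of the full attribute set."""
    all_attrs = list(range(len(rows[0])))
    target = dependency(rows, labels, all_attrs)
    reduct = []
    while dependency(rows, labels, reduct) < target:
        best = max((a for a in all_attrs if a not in reduct),
                   key=lambda a: dependency(rows, labels, reduct + [a]))
        reduct.append(best)
    return reduct

# Hypothetical toy table: three binary (discretized) attributes per patient.
rows = [
    (1, 0, 1),
    (1, 1, 1),
    (0, 0, 1),
    (0, 1, 1),
    (1, 0, 0),
    (0, 1, 0),
]
labels = [1, 1, 0, 0, 1, 0]  # made-up decision attribute (disease present or not)

selected = quick_reduct(rows, labels)
print("selected attribute indices:", selected)
```

In this toy table the first attribute alone separates the two classes, so the search stops after one step. Real clinical data would first need discretization of continuous attributes, which this sketch leaves out.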

Methods: The framework employs RST to select relevant features, followed by the integration of nine ML classifiers into five stacked ensemble models through correlation analysis to enhance predictive accuracy and reduce overfitting. The Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) ranks the models, with weights assigned using the Mean Rank Error Correction (MEREC) method. Hyperparameter tuning for the top model, Stack-4, was conducted using GridSearchCV, identifying XGBoost (XG) as the most effective classifier. To assess scalability and generalization, the framework was evaluated using additional datasets, including chronic kidney disease (CKD), obesity levels, and breast cancer. Explainable AI (XAI) techniques were also applied to clarify feature importance and decision-making processes.
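The ranking step can be illustrated with a small sketch. It assumes that MEREC here refers to the criteria-weighting scheme known in the MCDM literature as the method based on the removal effects of criteria, and that TOPSIS is used in its textbook form with vector normalization. The model names, criteria, and scores are made up for illustration and are not results from the paper.

```python
import math

def merec_weights(matrix, benefit):
    """MEREC-style weights: a criterion matters more when removing it
    changes the alternatives' overall performance more."""
    m = len(matrix[0])
    cols = list(zip(*matrix))
    # Normalize so that better values are smaller (all entries must be > 0).
    norm = [[(min(cols[j]) / row[j]) if benefit[j] else (row[j] / max(cols[j]))
             for j in range(m)] for row in matrix]
    logs = [[abs(math.log(v)) for v in row] for row in norm]
    overall = [math.log(1 + sum(row) / m) for row in logs]
    effects = []
    for j in range(m):
        # Performance of each alternative with criterion j left out.
        removed = [math.log(1 + (sum(row) - row[j]) / m) for row in logs]
        effects.append(sum(abs(r - s) for r, s in zip(removed, overall)))
    total = sum(effects)  # zero only if every alternative is identical
    return [e / total for e in effects]

def topsis(matrix, weights, benefit):
    """Closeness of each alternative to the ideal solution (higher is better)."""
    m = len(weights)
    cols = list(zip(*matrix))
    lengths = [math.sqrt(sum(v * v for v in col)) for col in cols]
    weighted = [[weights[j] * row[j] / lengths[j] for j in range(m)]
                for row in matrix]
    wcols = list(zip(*weighted))
    best = [max(c) if benefit[j] else min(c) for j, c in enumerate(wcols)]
    worst = [min(c) if benefit[j] else max(c) for j, c in enumerate(wcols)]
    scores = []
    for row in weighted:
        d_best = math.dist(row, best)
        d_worst = math.dist(row, worst)
        scores.append(d_worst / (d_best + d_worst))
    return scores

# Hypothetical criteria: accuracy, F1 score (benefit), training time in s (cost).
names = ["Model-A", "Model-B", "Model-C"]
matrix = [
    [0.90, 0.88, 12.0],
    [0.85, 0.84, 5.0],
    [0.80, 0.79, 20.0],
]
benefit = [True, True, False]

weights = merec_weights(matrix, benefit)
closeness = topsis(matrix, weights, benefit)
ranking = [name for _, name in sorted(zip(closeness, names), reverse=True)]
print("weights:", weights)
print("ranking (best first):", ranking)
```

Deriving the weights from the data rather than setting them by hand keeps the ranking free of manually chosen priorities, which is presumably why an objective weighting method is paired with TOPSIS here.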

Results: Stack-4 emerged as the highest-performing model, with XGBoost achieving the best predictive accuracy. The application of XAI techniques provided insights into the model's decision-making, highlighting key features influencing predictions.

Discussion: The findings demonstrate the effectiveness of the RST-ML framework in improving HD prediction accuracy. Its successful application to diverse datasets indicates strong generalization potential, making the framework a robust and scalable solution for timely diagnosis across various health conditions.
