Optimizing Stroke Risk Prediction: A Primary Dataset-Driven Ensemble Classifier With Explainable Artificial Intelligence

IF 2.1 Q2 MEDICINE, GENERAL & INTERNAL

Health Science Reports. Pub Date: 2025-05-05. DOI: 10.1002/hsr2.70799

Md. Maruf Hossain, Md. Mahfuz Ahmed, Md. Rakibul Hasan Rakib, Mohammad Osama Zia, Rakib Hasan, Md. Rakibul Islam, Md. Shohidul Islam, Md Shahariar Alam, Md. Khairul Islam

Citations: 0

Abstract

Background and Aims

Stroke remains a leading cause of mortality and long-term disability worldwide, presenting a significant global health challenge. Effective early prediction models are essential for reducing its impact. This study introduces a novel ensemble method for predicting stroke using two datasets: a primary dataset collected from a hospital, containing medical histories and clinical parameters, and a secondary dataset.

Methods

We applied several preprocessing techniques, including outlier detection, data normalization, k-means clustering, and missing value detection, to refine the datasets. A novel ensemble classifier was developed, combining AdaBoost, Gradient Boosting Machine (GBM), Multilayer Perceptron (MLP), and Random Forest (RF) algorithms to enhance predictive accuracy. Additionally, Explainable Artificial Intelligence (XAI) techniques such as SHAP and LIME were integrated to elucidate key features influencing stroke prediction.
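The abstract does not say how the four base learners are combined. As a minimal sketch, assuming a soft-voting rule (averaging each model's predicted stroke probability and thresholding the mean), the combination step might look like the following. The probabilities, the `soft_vote` helper, and the 0.5 threshold are hypothetical and chosen only for illustration; they are not taken from the study.

```python
from statistics import mean

# Hypothetical predicted probabilities of "stroke" for three patients,
# one list per base learner. Invented values, not results from the paper.
model_probs = {
    "adaboost": [0.62, 0.20, 0.48],
    "gbm":      [0.70, 0.15, 0.55],
    "mlp":      [0.55, 0.30, 0.40],
    "rf":       [0.66, 0.10, 0.60],
}


def soft_vote(probs_by_model, threshold=0.5):
    """Average each patient's probability across models, then threshold.

    Soft voting is an assumed combination rule; the source only states
    that AdaBoost, GBM, MLP, and RF are combined.
    """
    # zip(*...) regroups the per-model lists into per-patient tuples.
    per_patient = zip(*probs_by_model.values())
    averaged = [mean(p) for p in per_patient]
    labels = [int(p >= threshold) for p in averaged]
    return averaged, labels


averaged, labels = soft_vote(model_probs)
print(labels)  # [1, 0, 1]
```

In this toy input, the third patient splits the models two against two at the 0.5 threshold, so a hard majority vote would tie, while the averaged probability (about 0.5075) still yields a decision. That is one reason soft voting is a common choice, though the authors may have used a different scheme such as stacking or weighted voting.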

Results

The proposed ensemble classifier achieved an accuracy of 95% for the secondary dataset and 80.36% for the primary dataset. Comparative analysis with other machine learning models highlighted the superior performance of the ensemble approach. The integration of XAI further provided insights into the critical indicators influencing stroke classification, improving model interpretability and decision-making.

Conclusion

Our study demonstrates that the novel ensemble classifier, supported by effective preprocessing and XAI techniques, is a powerful tool for stroke prediction. The high accuracy rates achieved validate its effectiveness and potential for practical clinical application. Future work will focus on incorporating deep learning techniques and medical imaging to further improve classification accuracy and model performance.
