Stroke prediction using synthetic minority over-sampling technique and extreme gradient boosting.
Authors: Mahdi Hassan, Hamid Nasiri, Mona Esmaeili, Morteza Dorrigiv
Journal: Computer Methods in Biomechanics and Biomedical Engineering, pp. 1-12 (published 2025-10-08)
DOI: 10.1080/10255842.2025.2570510
Stroke is the world's second leading cause of death; its early prediction benefits from interpretable, high-accuracy models that can guide prevention and care. Using the Kaggle stroke dataset, we applied SMOTE to balance class distribution and trained XGBoost, Random Forest, LightGBM, CatBoost, and SVM models. XGBoost achieved 97.26% accuracy with robust 10-fold cross-validation, outperforming prior baselines. Model outputs were interpreted using the Shapley Additive Explanations (SHAP) algorithm, which identified age and hypertension/blood pressure as dominant predictors, providing both case-level insights and global feature rankings. The proposed pipeline offers practical, interpretable stroke-risk prediction with state-of-the-art performance suitable for clinical decision support.
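The class-balancing step mentioned in the abstract can be illustrated with a minimal sketch of SMOTE-style interpolation. This is not the authors' code: the abstract does not say which implementation was used, and the function names (`smote`, `balance`), the parameter defaults, and the toy data below are all hypothetical. The sketch uses only the Python standard library and assumes purely numeric features; it creates each synthetic minority sample by picking a minority point, choosing one of its k nearest minority neighbours, and placing a new point at a random position on the segment between them.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic samples interpolated between minority points.

    minority: list of equal-length numeric feature vectors (minority class only).
    Hypothetical helper for illustration; not a library API.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority points by Euclidean distance to the base point.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, minority_label=1, k=5, seed=0):
    """Oversample the minority class until both classes have equal size."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_majority = len(y) - len(minority)
    new = smote(minority, n_majority - len(minority), k=k, seed=seed)
    return X + new, y + [minority_label] * len(new)


# Made-up toy data: two numeric features per record, label 1 = minority class.
X = [[67.0, 228.7], [80.0, 105.9], [74.0, 171.2],
     [25.0, 88.1], [31.0, 92.4], [40.0, 79.0], [52.0, 110.3], [19.0, 85.6]]
y = [1, 1, 1, 0, 0, 0, 0, 0]

X_bal, y_bal = balance(X, y, k=2)
print(y_bal.count(0), y_bal.count(1))
```

When oversampling is combined with k-fold cross-validation, it is usually applied to the training folds only, so that synthetic points derived from a held-out record do not leak into training.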
About the journal:
The primary aims of Computer Methods in Biomechanics and Biomedical Engineering are to communicate advances in biomechanics and biomedical engineering and to stimulate interest in the continually emerging computer-based technologies being applied in these multidisciplinary subjects. The journal also provides a focus for the importance of integrating the disciplines of engineering with medical technology and clinical expertise. Such integration will have a major impact on health care in the future.