Detection of Cardiovascular Diseases Using Data Mining Approaches: Application of an Ensemble-Based Model
Authors: Mojdeh Nazari, Hassan Emami, Reza Rabiei, Azamossadat Hosseini, Shahabedin Rahmatizadeh
Journal: Cognitive Computation (impact factor 4.3; JCR Q2, Computer Science, Artificial Intelligence)
Publication date: 2024-05-30 (Journal Article)
DOI: https://doi.org/10.1007/s12559-024-10306-z
Abstract
Cardiovascular diseases are the leading contributor to mortality worldwide. Accurate prediction of cardiovascular disease is crucial, and machine learning and data mining techniques could facilitate decision-making and improve predictive capabilities. This study aimed to present a model that accurately predicts cardiovascular diseases and identifies the contributing factors with the greatest impact. Two datasets were used: the Cleveland dataset and a locally collected dataset, called the Noor dataset. Various data mining techniques, along with four ensemble learning-based models, were implemented on both datasets. Moreover, a novel model for combining individual classifiers in ensemble learning was developed, in which a genetic algorithm assigns a weight to each classifier. The predictive strength of each feature was also investigated to ensure the generalizability of the outcomes. The final ensemble-based model achieved a precision of 88.05% on the Cleveland dataset and 90.12% on the Noor dataset, demonstrating its reliability and its suitability for future research on predicting the likelihood of cardiovascular diseases. The proposed model not only introduces an innovative approach for identifying cardiovascular diseases by unraveling the intricate relationships between various biological variables, but also facilitates their early detection.
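The abstract does not give implementation details for the genetic-algorithm weighting, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes each base classifier outputs a positive-class probability, that the ensemble is a weighted soft vote with a 0.5 threshold, and that fitness is validation accuracy. The function names (`weighted_vote`, `fitness`, `evolve_weights`) and all hyperparameters are hypothetical.

```python
import numpy as np


def weighted_vote(probs, weights):
    """Weighted soft vote over base classifiers.

    probs:   array (n_classifiers, n_samples) of positive-class probabilities
    weights: array (n_classifiers,) of positive weights (normalized here)
    Returns an array of 0/1 predictions, one per sample.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return (w @ np.asarray(probs, dtype=float) >= 0.5).astype(int)


def fitness(weights, probs, y):
    """Assumed fitness: accuracy of the weighted ensemble on labels y."""
    return float(np.mean(weighted_vote(probs, weights) == y))


def evolve_weights(probs, y, pop_size=30, generations=40,
                   mutation_scale=0.1, seed=0):
    """Toy genetic algorithm: elitist selection, blend crossover, Gaussian mutation."""
    rng = np.random.default_rng(seed)
    n_clf = probs.shape[0]
    pop = rng.random((pop_size, n_clf)) + 1e-6  # strictly positive start
    n_elite = pop_size // 2
    n_children = pop_size - n_elite
    for _ in range(generations):
        scores = np.array([fitness(ind, probs, y) for ind in pop])
        elite = pop[np.argsort(scores)[::-1][:n_elite]]  # keep the best half
        # Crossover: each child is a random convex blend of two elite parents.
        a = elite[rng.integers(0, n_elite, size=n_children)]
        b = elite[rng.integers(0, n_elite, size=n_children)]
        alpha = rng.random((n_children, 1))
        children = alpha * a + (1.0 - alpha) * b
        # Mutation: add Gaussian noise, then clip so weights stay positive.
        children += rng.normal(0.0, mutation_scale, children.shape)
        children = np.clip(children, 1e-6, None)
        pop = np.vstack([elite, children])
    scores = np.array([fitness(ind, probs, y) for ind in pop])
    best = pop[int(np.argmax(scores))]
    return best / best.sum(), float(scores.max())
```

In practice the rows of `probs` would come from base classifiers evaluated on a held-out validation split, so that the weights are not tuned on the data used to report final performance; a different fitness measure (for example precision) could be substituted in `fitness`.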
About the journal:
Cognitive Computation is an international, peer-reviewed, interdisciplinary journal that publishes cutting-edge articles describing original basic and applied work involving biologically-inspired computational accounts of all aspects of natural and artificial cognitive systems. It provides a new platform for the dissemination of research, current practices and future trends in the emerging discipline of cognitive computation that bridges the gap between life sciences, social sciences, engineering, physical and mathematical sciences, and humanities.