Computer-Aided Diagnosis System for Early Prediction of Atherosclerosis using Machine Learning and K-fold cross-validation
B. Cherradi, O. Terrada, Asmae Ouhmida, S. Hamida, A. Raihani, O. Bouattane
2021 International Congress of Advanced Technology and Engineering (ICOTEN), published 2021-07-04
DOI: 10.1109/ICOTEN52080.2021.9493524
Citations: 18
Abstract
Atherosclerosis, also known as coronary artery disease (CAD), becomes epidemic in any society that relies on an industrial-technological system, with associated changes in people's lifestyles such as junk-food consumption and stressful habits. This disease remains the leading cause of death in industrialized countries, despite many new therapeutic approaches and risk-factor prevention efforts. Moreover, misdiagnosis of atherosclerosis has costly side effects. In this paper, we propose a computer-aided diagnosis system based on the K-Nearest Neighbors (KNN) and Artificial Neural Network (ANN) algorithms. We apply K-fold cross-validation to split the datasets and select the model with the highest accuracy and the fewest side effects. We tested the resulting model on 573 patients, using several effective features collected from the Cleveland and Z-Alizadeh Sani datasets. Area Under the Curve (AUC), F1-score, and accuracy were then used to assess the effectiveness of each predictive model. Using machine learning (ML) methods, K-fold cross-validation, and these performance evaluation metrics, the system achieves an average accuracy of 96.78%, with a training accuracy of 100%, which indicates that the proposed system outperforms the predictive models of previous studies.
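The evaluation loop described in the abstract (a KNN classifier scored with K-fold cross-validation, accuracy, and F1) can be sketched roughly as follows. This is a minimal illustration in pure Python, not the authors' implementation. The function names, the choice of 5 folds and 5 neighbours, and the synthetic two-cluster data are all assumptions made for the example, and the data do not come from the Cleveland or Z-Alizadeh Sani datasets. The ANN model and the AUC metric are omitted for brevity.

```python
import math
import random
from collections import Counter


def k_fold_indices(n_samples, n_folds, seed=0):
    """Shuffle the sample indices once and deal them into n_folds folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def knn_predict(train_X, train_y, query, k):
    """Majority vote among the k training points closest to the query."""
    nearest = sorted(range(len(train_X)),
                     key=lambda i: math.dist(train_X[i], query))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) with F1 computed for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    acc = sum(t == p for t, p in pairs) / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return acc, f1


def cross_validate_knn(X, y, n_folds=5, k=5, seed=0):
    """Hold out each fold in turn, train on the rest, and score the fold."""
    scores = []
    for held_out in k_fold_indices(len(X), n_folds, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(X)) if i not in held]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        preds = [knn_predict(train_X, train_y, X[i], k) for i in held_out]
        scores.append(accuracy_and_f1([y[i] for i in held_out], preds))
    return scores


def make_synthetic(n_per_class=60, seed=1):
    """Hypothetical stand-in data: two well-separated 2-D Gaussian clusters."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in ((0, 0.0), (1, 5.0)):
        for _ in range(n_per_class):
            X.append((rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)))
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic()
    fold_scores = cross_validate_knn(X, y)
    mean_acc = sum(a for a, _ in fold_scores) / len(fold_scores)
    mean_f1 = sum(f for _, f in fold_scores) / len(fold_scores)
    print(f"mean accuracy={mean_acc:.3f}, mean F1={mean_f1:.3f}")
```

Averaging the per-fold scores gives the kind of "average accuracy" figure the abstract reports. Because every sample is held out exactly once, the average reflects performance on data the model did not see during training, unlike the training accuracy.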