Foad Ghasemi, Behzad Soleimani Neysiani, N. Nematbakhsh
Published in: 2020 6th International Conference on Web Research (ICWR), 2020-04-01. DOI: 10.1109/ICWR49608.2020.9122285 (https://doi.org/10.1109/ICWR49608.2020.9122285)
Feature Selection in Pre-Diagnosis Heart Coronary Artery Disease Detection: A heuristic approach for feature selection based on Information Gain Ratio and Gini Index
Cardiovascular disease is one of the most common causes of mortality in the world. Among the different types of this disease, coronary artery disease (CAD) is the most important, and its correct and timely diagnosis is vital. Diagnostic and treatment methods for this disease have many side effects and costs. The best and most accurate diagnostic method is angiography, so researchers seek economical, high-accuracy alternatives. The disease-related features and different data mining techniques are described, with the aim of increasing diagnostic accuracy using a single dataset of essential and useful features. Data were collected from 303 suspected cardiovascular patients at Shahid Rajaee Hospital, Tehran; 87 of the samples are healthy and 216 are sick. In the first step, optimal feature subsets are selected with respect to performance, speed of diagnosis, and precision in order to determine the severity of CAD. This feature selection supports prediction and improves the learning model. The optimal machine learning models are then applied to analyze and predict CAD. An accuracy of 99.67% is achieved for this diagnosis, the highest obtained in this field. The left anterior descending (LAD), left circumflex (LCX), and right coronary artery (RCA) features are diagnosed with high accuracy using those models. These three features appear to define CAD and depend on angiography. If they are eliminated for the pre-diagnosis setting, the accuracy of CAD detection falls to between 83% and 86% for the proposed reduced feature subset, reflecting the resulting reduction in performance.
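The abstract does not spell out the heuristic itself, so the following is only a minimal sketch of the two scoring criteria named in the title, using their textbook definitions for categorical features. The function names (`entropy`, `gini`, `gain_ratio`, `gini_gain`, `rank_features`) and the dictionary-of-columns input are assumptions made for illustration, not the authors' implementation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of categorical values."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of categorical values."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def _partition(values, labels):
    """Group class labels by the feature value they co-occur with."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    return groups


def gain_ratio(values, labels):
    """Information gain of a feature divided by its split information."""
    n = len(labels)
    groups = _partition(values, labels)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(values)
    # A feature with a single value carries no split information.
    return info_gain / split_info if split_info > 0 else 0.0


def gini_gain(values, labels):
    """Reduction in Gini impurity obtained by splitting on a feature."""
    n = len(labels)
    groups = _partition(values, labels)
    weighted = sum(len(g) / n * gini(g) for g in groups.values())
    return gini(labels) - weighted


def rank_features(table, labels, score=gain_ratio):
    """Return feature names sorted from most to least informative.

    `table` is a hypothetical mapping: feature name -> list of values.
    """
    return sorted(table, key=lambda name: score(table[name], labels), reverse=True)


if __name__ == "__main__":
    # Class balance taken from the abstract: 87 healthy, 216 sick.
    classes = ["healthy"] * 87 + ["sick"] * 216
    print(f"Gini impurity of the class distribution: {gini(classes):.3f}")
```

With the class balance reported in the abstract, the Gini impurity of the label column is about 0.409. A feature would be scored by how much it reduces that impurity (or by its gain ratio), and `rank_features` could then be used to keep only the top-scoring columns before training a classifier.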