Xianzhen Huang, H. Fan, Hongjin Zhu, Xiangping Zhang
Title: Prediction of Heart Disease based on Enhanced Random Forest
Venue: 2021 IEEE 7th International Conference on Cloud Computing and Intelligent Systems (CCIS)
Published: 2021-11-07
DOI: 10.1109/CCIS53392.2021.9754669
Citations: 1
Abstract
In recent years, researchers have proposed algorithms such as Bayesian networks, neural networks, random forests, and K-Means clustering to better predict heart disease. To improve prediction accuracy, this paper optimizes the model in two ways: (1) the Synthetic Minority Oversampling Technique (SMOTE) is used to deal with the uneven distribution of the data set and the small number of samples; (2) based on the complexity of the samples, a similarity method is used to improve the classification accuracy of the random forest. Our analysis shows that the proposed model based on the enhanced random forest achieves higher accuracy than the traditional method. In the prediction of heart disease, the optimized algorithm improves accuracy by 5.96% compared with random forest.
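The oversampling step named in (1) can be illustrated with a minimal sketch of the general SMOTE idea: each synthetic point is placed at a random position on the line segment between a minority-class sample and one of its k nearest minority-class neighbours. This is not the authors' implementation. The function name `smote`, its parameters, and the toy data are assumptions made for illustration only, and the similarity-based random forest step in (2) is not shown because the abstract does not specify it.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_synthetic):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        # Choose one of the k nearest neighbours at random.
        neighbour = minority[rng.choice(others[:k])]
        # Interpolate at a random position along the base-to-neighbour segment.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority-class samples (made up for illustration, not from the paper).
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote(minority, n_synthetic=6, k=2)
```

In a pipeline of this kind, the synthetic points would be added to the training split only, so that the held-out data used to measure accuracy contains real samples exclusively.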