Prediction of Hepatitis Disease Using K-Nearest Neighbors, Naive Bayes, Support Vector Machine, Multi-Layer Perceptron and Random Forest
Authors: M. Nayeem, Sohel Rana, Farjana Alam, M. Rahman
Published in: 2021 International Conference on Information and Communication Technology for Sustainable Development (ICICT4SD), 2021-02-27
DOI: https://doi.org/10.1109/ICICT4SD50815.2021.9397013
Citations: 7
Abstract
Hepatitis is a serious disease that causes deaths around the world. It causes inflammation of the human liver. Detecting this disease early can save many lives. In this paper, we predict hepatitis using several data mining techniques and propose a way to improve the performance of our prediction models. We handle missing values in our dataset by removing the observations that contain them, and we identify unnecessary features using the info-gain feature selection procedure with ranker search. Five classification techniques, K-Nearest Neighbors (KNN), Naive Bayes, Support Vector Machine (SVM), Multi-Layer Perceptron (MLP) and Random Forest, are applied to the hepatitis dataset to calculate prediction accuracy. We measure accuracy, precision, recall, F1-score and ROC, which help us compare the performance of the classification models. Removing the observations with missing values and applying info-gain feature selection helped improve the accuracy of our prediction models. Random Forest gave the best performance, with a classification accuracy of 92.41%.
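The two preprocessing steps named in the abstract, removing observations with missing values and ranking features by information gain, can be sketched in plain Python. This is only an illustrative sketch, not the authors' implementation: the `?` missing-value marker, the function names, the feature names and the toy rows are all assumptions made for the example, and the features are treated as categorical.

```python
import math
from collections import Counter

MISSING = "?"  # assumed missing-value marker; the abstract does not specify one


def drop_incomplete(rows):
    """Listwise deletion: keep only rows that contain no missing value."""
    return [row for row in rows if MISSING not in row]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def info_gain(rows, feature_idx, label_idx=-1):
    """Information gain of one categorical feature with respect to the class label."""
    labels = [row[label_idx] for row in rows]
    # Group the class labels by the value the feature takes in each row.
    groups = {}
    for row in rows:
        groups.setdefault(row[feature_idx], []).append(row[label_idx])
    # Weighted entropy of the class label after splitting on the feature.
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_features(rows, names):
    """Return (name, gain) pairs sorted from most to least informative."""
    gains = [(name, info_gain(rows, i)) for i, name in enumerate(names)]
    return sorted(gains, key=lambda pair: pair[1], reverse=True)


# Toy data invented for illustration only (not taken from the hepatitis dataset).
# Each row is (feature_a, feature_b, class_label).
names = ["feature_a", "feature_b"]
rows = [
    ("yes", "no", "die"),
    ("yes", "yes", "die"),
    ("no", "no", "live"),
    ("no", "yes", "live"),
    ("no", "?", "live"),   # dropped: feature_b is missing
    ("?", "no", "die"),    # dropped: feature_a is missing
]

complete = drop_incomplete(rows)
for name, gain in rank_features(complete, names):
    print(f"{name}: {gain:.3f}")
```

On this toy data `feature_a` separates the two classes perfectly and ranks first, while `feature_b` carries no information about the class and would be a candidate for removal. A cut-off on the ranked gains would then decide which features to keep; the abstract does not state what threshold was used.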