Prediction of Liver Cancer Based on DNA Sequence Using Ensemble Method
L. Muflikhah, N. Widodo, W. Mahmudy, Solimun
2020 3rd International Seminar on Research of Information Technology and Intelligent Systems (ISRITI), 2020-12-10
DOI: 10.1109/ISRITI51436.2020.9315341 (https://doi.org/10.1109/ISRITI51436.2020.9315341)
Citations: 3
Abstract
Chronic hepatitis B virus (HBV) infection is strongly associated with liver cancer. The DNA sequence of the virus integrates into the human genome and affects the cell cycle. HBx is a viral gene responsible for the replication that the virus needs to survive, despite its high mutation rate. Machine learning methods are effective in biological analysis and are widely used to make diagnostic predictions. This study aims to predict liver cancer using machine learning methods based on the DNA sequence of HBV. However, unbalanced data affects the measured performance of a learning method, especially its sensitivity and specificity. Therefore, this paper proposes an ensemble method to improve prediction performance. We compare several classifiers, including Naive Bayes, GLM, KNN, SVM, and the C5.0 decision tree. The results show that the ensemble method performs well, with an accuracy of 88.4%, a sensitivity of 88.4%, and a specificity of 91.4%.
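The abstract does not say how the ensemble combines its base classifiers, so the following is only an illustrative sketch that assumes simple majority voting. It also shows how accuracy, sensitivity, and specificity are derived from a binary confusion matrix. The function names `majority_vote` and `binary_metrics` and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine label lists from several classifiers into one label per sample.

    `predictions` holds one list of labels per classifier, all the same length.
    Majority voting is an assumed combination rule, not the paper's stated one.
    """
    combined = []
    for sample_votes in zip(*predictions):
        # Pick the most frequent label among the classifiers for this sample.
        combined.append(Counter(sample_votes).most_common(1)[0][0])
    return combined


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity, and specificity for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Sensitivity: share of true positives among all actual positives.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Specificity: share of true negatives among all actual negatives.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Made-up toy labels (1 = cancer, 0 = non-cancer), for illustration only.
y_true = [1, 1, 1, 0, 0, 0]
clf_a = [1, 1, 0, 0, 0, 1]
clf_b = [1, 0, 1, 0, 1, 0]
clf_c = [0, 1, 1, 1, 0, 0]

ensemble = majority_vote([clf_a, clf_b, clf_c])
print(binary_metrics(y_true, clf_a))
print(binary_metrics(y_true, ensemble))
```

Reporting sensitivity and specificity separately matters on unbalanced data because accuracy alone can look high when a classifier mostly predicts the majority class. Using an odd number of voters avoids ties in the binary case.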