Niloy Das, Md Bipul Hossain, Apurba Adhikary, Avi Deb Raha, Yu Qiao, Md Mehedi Hassan, Anupam Kumar Bairagi
PLoS ONE 20(4): e0319078. Published 2025-04-02. DOI: https://doi.org/10.1371/journal.pone.0319078. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11964459/pdf/
Enlightened prognosis: Hepatitis prediction with an explainable machine learning approach.
Hepatitis is a widespread inflammatory condition of the liver and a formidable global health challenge. Accurate and timely detection is crucial for effective patient management, yet existing methods have limitations that underscore the need for new approaches. The recent adoption of machine learning and deep learning has made early-stage detection of hepatitis possible. With this in mind, the study investigates traditional machine learning classifiers, including logistic regression, support vector machines (SVM), decision trees, random forest, and multilayer perceptron (MLP), among others, for predicting hepatitis infection. After extensive data preprocessing, including outlier detection, dataset balancing, and feature engineering, we evaluated these models under three modeling approaches: default hyperparameters, hyperparameters tuned with GridSearchCV, and ensemble techniques. The SVM model performed best, achieving 99.25% accuracy and an AUC of 1.00, along with 99.27% precision and 99.24% for both recall and F1-measure. The MLP and random forest models kept pace with the SVM, each reaching 99.00% accuracy. To ensure robustness, we used 5-fold cross-validation. To gain deeper insight into model interpretability and validation, we applied an explainability analysis to our best-performing models to identify the most effective feature for hepatitis detection. Our proposed models, particularly the SVM, achieve better prediction performance than the existing literature across several performance metrics.
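The abstract reports accuracy, precision, recall, and F1-measure under 5-fold cross-validation. As a rough illustration of the arithmetic behind those terms, the sketch below computes the four metrics for a binary classifier and generates five train/test index splits using only the Python standard library. It is not the authors' code: the function names `binary_metrics` and `k_fold_indices`, the choice of label 1 as the positive class, the contiguous (unshuffled, unstratified) folds, and the toy labels are all assumptions made for this example. The study itself tuned its models with GridSearchCV.

```python
from typing import Dict, Iterator, List, Sequence, Tuple


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1, treating label 1 as positive (an assumption)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def k_fold_indices(n_samples: int, k: int = 5) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for k contiguous folds of near-equal size."""
    # The first (n_samples % k) folds receive one extra sample each.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


if __name__ == "__main__":
    # Toy labels, purely illustrative; not data from the study.
    print(binary_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
    for train_idx, test_idx in k_fold_indices(12, k=5):
        print(len(train_idx), test_idx)
```

In practice, a model would be fitted on each `train` split and scored on the matching `test` split, with the per-fold metrics averaged. Shuffled or stratified folds are usually preferable on imbalanced clinical data.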
About the journal:
PLOS ONE is an international, peer-reviewed, open-access, online publication. PLOS ONE welcomes reports on primary research from any scientific discipline. It provides:
* Open-access—freely accessible online, authors retain copyright
* Fast publication times
* Peer review by expert, practicing researchers
* Post-publication tools to indicate quality and impact
* Community-based dialogue on articles
* Worldwide media coverage