Machine learning-based models for advanced fibrosis in non-alcoholic steatohepatitis patients: A cohort study
Fei-Xiang Xiong, Lei Sun, Xue-Jie Zhang, Jia-Liang Chen, Yang Zhou, Xiao-Min Ji, Pei-Pei Meng, Tong Wu, Xian-Bo Wang, Yi-Xin Hou
World Journal of Gastroenterology, volume 31, issue 9, article 101383. Published 2025-03-07. DOI: https://doi.org/10.3748/wjg.v31.i9.101383
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11886044/pdf/
Abstract
Background: The global prevalence of non-alcoholic steatohepatitis (NASH) and its associated risk of adverse outcomes, particularly in patients with advanced liver fibrosis, underscores the importance of early and accurate diagnosis.
Aim: To develop a machine learning-based diagnostic model for advanced liver fibrosis in NASH patients.
Methods: A total of 749 patients who underwent liver biopsy at Beijing Ditan Hospital, Capital Medical University, between January 2010 and January 2020 were included. Patients were randomly divided into training (n = 522) and validation (n = 224) cohorts. Five machine learning models were applied to predict advanced liver fibrosis, with feature selection based on Shapley Additive Explanations (SHAP). The diagnostic performance of these models was compared to traditional scores such as the aspartate aminotransferase to platelet ratio index (APRI) and fibrosis index based on the 4 factors (FIB-4), using metrics including the area under the receiver operating characteristic curve (AUROC), decision curve analysis (DCA), and calibration curves.
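For readers unfamiliar with the two traditional scores named in the Methods, the following is a minimal sketch of how APRI and FIB-4 are conventionally calculated. The formulas are the standard published definitions, not taken from this abstract; the function names, parameter names, and the example patient values are hypothetical and purely illustrative.

```python
import math


def apri(ast_u_l: float, ast_uln_u_l: float, platelets_10e9_l: float) -> float:
    """AST to platelet ratio index.

    APRI = (AST / upper limit of normal for AST) * 100 / platelet count (10^9/L)
    """
    return (ast_u_l / ast_uln_u_l) * 100.0 / platelets_10e9_l


def fib4(age_years: float, ast_u_l: float, alt_u_l: float,
         platelets_10e9_l: float) -> float:
    """Fibrosis-4 index.

    FIB-4 = (age * AST) / (platelet count (10^9/L) * sqrt(ALT))
    """
    return (age_years * ast_u_l) / (platelets_10e9_l * math.sqrt(alt_u_l))


# Hypothetical patient values, chosen only to make the arithmetic easy to follow.
print(apri(ast_u_l=80, ast_uln_u_l=40, platelets_10e9_l=100))            # 2.0
print(fib4(age_years=50, ast_u_l=80, alt_u_l=64, platelets_10e9_l=100))  # 5.0
```

Both scores rise as AST increases and as the platelet count falls. Which cut-off values the study applied is not stated in the abstract.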
Results: The Extreme Gradient Boosting (XGBoost) model outperformed all other machine learning models, achieving an AUROC of 0.934 (95%CI: 0.914-0.955) in the training cohort and 0.917 (95%CI: 0.880-0.953) in the validation cohort (P < 0.001). Incorporating liver stiffness measurement into the model further improved its performance, with an AUROC of 0.977 (95%CI: 0.966-0.980) in the training cohort and 0.970 (95%CI: 0.950-0.990) in the validation cohort, significantly surpassing APRI and FIB-4 scores (P < 0.001). The XGBoost model also demonstrated superior clinical utility, as evidenced by DCA and calibration curve analysis in both cohorts.
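The AUROC values reported above can be read as the probability that a randomly chosen patient with advanced fibrosis receives a higher model score than a randomly chosen patient without it. The sketch below computes that quantity directly from pairwise comparisons. It is an illustration of the metric only, not the authors' evaluation code, and the labels and scores are made-up toy data.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 (1 = advanced fibrosis in this illustration)
    scores: iterable of model outputs, higher = more likely positive
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): two negatives, two positives.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(toy_labels, toy_scores))  # 0.75
```

The pairwise form is quadratic in the number of patients, which is acceptable for cohorts of a few hundred; rank-based implementations compute the same value more efficiently.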
Conclusion: The XGBoost model provides a highly accurate, non-invasive diagnosis of advanced liver fibrosis in NASH patients, outperforming traditional methods. An online tool based on this model has been developed to assist clinicians in evaluating the risk of advanced liver fibrosis.
About the journal:
The primary aims of the WJG are to improve diagnostic, therapeutic and preventive modalities and the skills of clinicians and to guide clinical practice in gastroenterology and hepatology.