Extreme Gradient Boosting-based Explainable Machine Learning Model for Predicting Significant Fibrosis in Autoimmune Hepatitis

Impact factor: 6.4 | Zone 4 (Medicine) | JCR Q1: MEDICINE, GENERAL & INTERNAL

QJM: An International Journal of Medicine | Pub Date: 2025-09-18 | DOI: 10.1093/qjmed/hcaf215

Zhiyi Zhang, Jing Wu, Jian Wang, Yun Chen, Renling Yao, Li Zhu, Yiguang Li, Shaoqiu Zhang, Yifan Pan, Fei Cao, Yuanyuan Li, Jiacheng Liu, Yuxin Chen, Shengxia Yin, Xin Tong, Qun Zhang, Xinrong Zhang, Yuanwang Qiu, Chuanwu Zhu, Huali Wang, Chao Wu, Rui Huang


Citations: 0

Abstract


Extreme Gradient Boosting-based Explainable Machine Learning Model for Predicting Significant Fibrosis in Autoimmune Hepatitis.

Background: Accurate assessment of liver fibrosis is crucial for patients with autoimmune hepatitis (AIH).

Aim: We developed and validated a non-invasive explainable machine-learning model for the prediction of liver fibrosis in patients with AIH.

Design: A retrospective multi-center study of AIH patients with liver biopsy was conducted.

Methods: Patients were randomly divided into a training set and a test set. Nine machine learning (ML) models were built: logistic regression, k-nearest neighbors, support vector machine, random forest, extreme gradient boosting (XGBoost), gradient boosting, AdaBoost, decision tree, and Gaussian naive Bayes. The best model was compared with the aspartate aminotransferase-to-platelet ratio index (APRI) and the fibrosis index based on 4 factors (FIB-4) on the test set by area under the receiver operating characteristic curve (AUC). SHapley Additive exPlanations (SHAP) analysis and local interpretable model-agnostic explanations (LIME) were used for model explanation.
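The two comparator scores named in the Methods are simple closed-form indices. The abstract does not restate their formulas, so the sketch below uses the commonly cited definitions: APRI = (AST / upper limit of normal for AST) / platelet count × 100, and FIB-4 = (age × AST) / (platelet count × √ALT). The function names, the upper limit of normal of 40 U/L, and the patient values are hypothetical, chosen only for illustration; they are not taken from the study.

```python
import math


def apri(ast_u_l: float, ast_uln_u_l: float, platelets_10e9_l: float) -> float:
    """APRI: (AST / upper limit of normal) divided by platelet count (10^9/L), times 100."""
    return (ast_u_l / ast_uln_u_l) / platelets_10e9_l * 100.0


def fib4(age_years: float, ast_u_l: float, alt_u_l: float,
         platelets_10e9_l: float) -> float:
    """FIB-4: (age * AST) divided by (platelet count (10^9/L) * sqrt(ALT))."""
    return (age_years * ast_u_l) / (platelets_10e9_l * math.sqrt(alt_u_l))


# Hypothetical patient, not from the study cohort.
# AST upper limit of normal of 40 U/L is an assumption; laboratories differ.
apri_score = apri(ast_u_l=80, ast_uln_u_l=40, platelets_10e9_l=160)
fib4_score = fib4(age_years=54, ast_u_l=80, alt_u_l=64, platelets_10e9_l=160)

print(f"APRI = {apri_score:.3f}")   # (80/40)/160*100 = 1.250
print(f"FIB-4 = {fib4_score:.3f}")  # (54*80)/(160*8) = 3.375
```

Both indices have platelet count in the denominator, so a falling platelet count raises either score.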

Results: A total of 261 AIH patients were included, with a median age of 54.0 years (IQR: 47.0-62.0); 82.8% were female. Among the nine ML models, the XGBoost model showed the best predictive performance. It achieved an AUC of 0.791 (95% confidence interval [CI]: 0.668-0.890) in the test set, higher than APRI (AUC: 0.557, 95% CI: 0.380-0.732, P < 0.001) and FIB-4 (AUC: 0.625, 95% CI: 0.452-0.789, P < 0.001). SHAP and LIME analyses identified platelet count as the most important predictor of significant liver fibrosis.
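To make the reported AUC values and their 95% confidence intervals concrete, here is a minimal standard-library sketch of how an AUC and an interval around it can be computed on a test set. The abstract does not say how the intervals were obtained; the percentile bootstrap used here is an assumption, as are the function names and the toy labels and scores.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a random positive scores above a random negative.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:  # a resample with one class has no defined AUC
            continue
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy test set: 1 = significant fibrosis on biopsy, 0 = no significant fibrosis.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
model_scores = [0.10, 0.40, 0.35, 0.80, 0.45, 0.70, 0.90, 0.60]

point = auc(labels, model_scores)
lower, upper = bootstrap_ci(labels, model_scores)
print(f"AUC = {point:.4f}, 95% CI: {lower:.3f}-{upper:.3f}")
```

With a small test set the interval is wide, which is consistent with the broad intervals reported for all three scores above.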

Conclusions: The non-invasive interpretable XGBoost model surpasses APRI and FIB-4 for predicting significant liver fibrosis, contributing to better management of AIH patients.


Source journal

QJM: An International Journal of Medicine (Medicine: General & Internal)

CiteScore: 6.90
Self-citation rate: 5.30%
Articles published: 263
Review time: 4-8 weeks

About the journal: QJM, a renowned and reputable general medical journal, has been a prominent source of knowledge in the field of internal medicine. With a steadfast commitment to advancing medical science and practice, it features a selection of rigorously reviewed articles. Released on a monthly basis, QJM encompasses a wide range of article types. These include original papers that contribute innovative research, editorials that offer expert opinions, and reviews that provide comprehensive analyses of specific topics. The journal also presents commentary papers aimed at initiating discussions on controversial subjects and allocates a dedicated section for reader correspondence. In summary, QJM's reputable standing stems from its enduring presence in the medical community, consistent publication schedule, and diverse range of content designed to inform and engage readers.