Development of machine learning-based personalized predictive models for risk evaluation of hepatocellular carcinoma in hepatitis B virus-related cirrhosis patients with low levels of serum alpha-fetoprotein

IF 3.7 · Zone 3 (Medicine) · JCR Q2 Gastroenterology & Hepatology
Annals of Hepatology. Published 2024-08-15. DOI: 10.1016/j.aohep.2024.101540. Full text: https://www.sciencedirect.com/science/article/pii/S166526812400334X

Abstract

Introduction and Objectives

The increasing incidence of hepatocellular carcinoma (HCC) in China is an urgent issue that calls for early diagnosis and treatment. This study aimed to develop personalized predictive models by combining machine learning (ML) technology with demographic, medical history, and noninvasive biomarker data. These models can support physicians' decision-making about HCC in patients with hepatitis B virus (HBV)-related cirrhosis and low serum alpha-fetoprotein (AFP) levels.

Patients and Methods

A total of 6,980 patients treated between January 2012 and December 2018 were included. Pre-treatment laboratory tests and clinical data were obtained. Significant risk factors for HCC were identified, and the relative risk associated with each variable was calculated using ML and univariate regression analysis. The data set was then randomly partitioned into a training set (80%) and a validation set (20%) to develop the ML models.
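The random 80/20 partition described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the abstract does not say which software, seed, or stratification was used, so the function name `split_train_validation`, the seed, and the placeholder patient IDs are all assumptions.

```python
import random


def split_train_validation(records, validation_fraction=0.2, seed=42):
    """Randomly partition records into (training, validation) lists.

    Hypothetical helper: the seed and the unstratified shuffle are
    assumptions, not details reported in the abstract.
    """
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be between 0 and 1")
    shuffled = list(records)  # copy so the caller's data is left untouched
    random.Random(seed).shuffle(shuffled)
    n_validation = round(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]


# Placeholder IDs standing in for the 6,980 patients.
patient_ids = list(range(6980))
training, validation = split_train_validation(patient_ids)
print(len(training), len(validation))  # 5584 1396
```

In practice a split stratified by HCC status is often preferred so that both sets keep a similar case proportion; whether the study did this is not stated.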

Results

Twelve independent risk factors for HCC were identified using Gaussian naïve Bayes, extreme gradient boosting (XGBoost), random forest, and least absolute shrinkage and selection operator (LASSO) regression models. Multivariate analysis revealed that male sex, age >60 years, alkaline phosphatase >150 U/L, AFP >25 ng/mL, carcinoembryonic antigen >5 ng/mL, and fibrinogen >4 g/L were risk factors, whereas hypertension, calcium <2.25 mmol/L, potassium ≤3.5 mmol/L, direct bilirubin >6.8 μmol/L, hemoglobin <110 g/L, and glutamic-pyruvic transaminase >40 U/L were protective factors in HCC patients. A nomogram based on these factors showed an area under the curve (AUC) of 0.746 (sensitivity = 0.710, specificity = 0.646), significantly higher than the AUC of AFP alone, 0.658 (sensitivity = 0.462, specificity = 0.766). Among the ML algorithms compared, the XGBoost model had an AUC of 0.832 (sensitivity = 0.745, specificity = 0.766) and an independent validation AUC of 0.829 (sensitivity = 0.766, specificity = 0.737), making it the top-performing model in both sets. The external validation results confirmed the accuracy of the XGBoost model.
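For readers less familiar with the metrics reported above, the sketch below shows how sensitivity, specificity, and AUC can be computed from predicted risk scores. It is illustrative only: the abstract does not state how the metrics were calculated or which probability cutoff was used, and the labels, scores, and the 0.5 threshold here are made-up toy values.

```python
def sensitivity_specificity(labels, scores, threshold):
    """Return (sensitivity, specificity) when score >= threshold predicts HCC."""
    tp = fn = tn = fp = 0
    for label, score in zip(labels, scores):
        predicted_positive = score >= threshold
        if label == 1:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data (hypothetical): 1 = HCC, 0 = no HCC.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.5]

print(auc(labels, scores))                           # 0.875
print(sensitivity_specificity(labels, scores, 0.5))  # (0.75, 0.5)
```

AUC does not depend on a cutoff, whereas sensitivity and specificity do, which is why a model can have a higher AUC than AFP while trading off the two differently at its chosen threshold.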

Conclusions

The proposed XGBoost model demonstrated a promising ability for individualized prediction of HCC in HBV-related cirrhosis patients with low AFP levels.

Source journal: Annals of Hepatology (Medicine: Gastroenterology & Hepatology)
CiteScore: 7.90
Self-citation rate: 2.60%
Articles published: 183
Review time: 4-8 weeks
About the journal: Annals of Hepatology publishes original research on the biology and diseases of the liver in both humans and experimental models. Contributions may be submitted as regular articles. The journal also publishes concise reviews of both basic and clinical topics.