Construction and evaluation of a liver cancer risk prediction model based on machine learning.

IF 2.5; Zone 4 (Medicine); JCR Q2, Gastroenterology & Hepatology
Ying-Ying Wang, Wan-Xia Yang, Qia-Jun Du, Zhen-Hua Liu, Ming-Hua Lu, Chong-Ge You
Journal: World Journal of Gastrointestinal Oncology
Published: 2024-09-15
Publication type: Journal Article
DOI: 10.4251/wjgo.v16.i9.3839 (https://doi.org/10.4251/wjgo.v16.i9.3839)
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11438789/pdf/
Citations: 0

Abstract

Background: Liver cancer is one of the most prevalent malignant tumors worldwide, and its early detection and treatment are crucial for enhancing patient survival rates and quality of life. However, the early symptoms of liver cancer are often not obvious, resulting in a late-stage diagnosis in many patients, which significantly reduces the effectiveness of treatment. Developing a highly targeted, widely applicable, and practical risk prediction model for liver cancer is crucial for enhancing the early diagnosis and long-term survival rates among affected individuals.

Aim: To develop a liver cancer risk prediction model by employing machine learning techniques, and subsequently assess its performance.

Methods: A total of 550 patients were enrolled: 190 hepatocellular carcinoma (HCC) and 195 cirrhosis patients formed the training cohort, and 83 HCC and 82 cirrhosis patients formed the validation cohort. Logistic regression (LR), support vector machine (SVM), random forest (RF), and least absolute shrinkage and selection operator (LASSO) regression models were developed in the training cohort, and their performance was assessed in the validation cohort. The diagnostic efficacy of these models was also compared with that of the ASAP model using receiver operating characteristic (ROC) curves, calibration curves, and decision curve analysis (DCA) to determine the optimal model for predicting liver cancer risk.
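
The ROC and DCA evaluations named in the Methods can be illustrated with a minimal Python sketch that uses only the standard library. It is not the authors' code: the function names roc_auc and net_benefit, the toy labels and scores, and the 0.5 risk threshold are all hypothetical. AUC is computed as the fraction of HCC/cirrhosis pairs in which the HCC case receives the higher score (ties count one half), and net benefit uses the usual decision-curve definition, TP/n - (FP/n) * pt / (1 - pt).

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive case outranks a random negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half a win
    return wins / (len(pos) * len(neg))


def net_benefit(labels, scores, threshold):
    """Net benefit at risk threshold pt: TP/n - (FP/n) * pt / (1 - pt)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


# Hypothetical toy data: 1 = HCC, 0 = cirrhosis; scores are predicted risks.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]

print(roc_auc(labels, scores))           # 0.9375 (15 of 16 pairs ranked correctly)
print(net_benefit(labels, scores, 0.5))  # 0.25 (3 true positives, 1 false positive, n = 8)
```

Evaluating net_benefit over a range of thresholds for each model, alongside the treat-all and treat-none strategies, gives the points of a decision curve.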

Results: Six variables (age; white blood cell, red blood cell, and platelet counts; and alpha-fetoprotein and protein induced by vitamin K absence or antagonist II levels) were used to develop the LR, SVM, RF, and LASSO regression models. The RF model exhibited superior discrimination, with an area under the curve (AUC) of 0.969 in the training set and 0.858 in the validation set. These values significantly surpassed those of the LR (0.850 and 0.827), SVM (0.860 and 0.803), LASSO regression (0.845 and 0.831), and ASAP (0.866 and 0.813) models. Furthermore, calibration curves and DCA indicated that the RF model exhibited robust calibration and clinical validity.
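
As a reading aid, the reported AUCs can be tabulated to rank the models on the validation set and to see how far each one falls from training to validation. The sketch below uses only the figures quoted in the Results. The dictionary layout and the helper name auc_drop are illustrative assumptions, and the abstract itself does not report or interpret these differences.

```python
# AUCs as reported in the Results: (training set, validation set).
reported_auc = {
    "RF": (0.969, 0.858),
    "LR": (0.850, 0.827),
    "SVM": (0.860, 0.803),
    "LASSO": (0.845, 0.831),
    "ASAP": (0.866, 0.813),
}


def auc_drop(train_auc, valid_auc):
    """Training-to-validation AUC difference, rounded to three decimals."""
    return round(train_auc - valid_auc, 3)


# Rank models by validation AUC, best first.
ranked = sorted(reported_auc, key=lambda name: reported_auc[name][1], reverse=True)
drops = {name: auc_drop(*pair) for name, pair in reported_auc.items()}

for name in ranked:
    train_auc, valid_auc = reported_auc[name]
    print(f"{name:<6} train={train_auc:.3f} valid={valid_auc:.3f} drop={drops[name]:.3f}")
```

On these figures, RF ranks first on the validation set and also shows the largest training-to-validation drop (0.111), while LASSO shows the smallest (0.014).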

Conclusion: The RF model demonstrated excellent prediction capabilities for HCC and can facilitate early diagnosis of HCC in clinical practice.

Source journal
World Journal of Gastrointestinal Oncology (Medicine: Gastroenterology)
CiteScore: 4.20
Self-citation rate: 3.30%
Articles published: 1082
About the journal: The World Journal of Gastrointestinal Oncology (WJGO) is a leading academic journal devoted to reporting the latest, cutting-edge research progress and findings of basic research and clinical practice in the field of gastrointestinal oncology.