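Construction of diagnostic models with machine-learning algorithms for colorectal cancer based on clinical laboratory parameters.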

IF 2.0 | Zone 4 (Medicine) | JCR Q3, GASTROENTEROLOGY & HEPATOLOGY
Journal of gastrointestinal oncology Pub Date : 2024-10-31 Epub Date: 2024-09-12 DOI:10.21037/jgo-24-516
Dengqing Si, Yu Shu, Hongbo Jiang, Xueping Lin, Qiurong Yuan, Shaotuan Deng, Wei Luo, Yangze Lin, Ju Wang, Chengxiong Zhan, Aasma Shaukat, Peter C Ambe, Shiqiong Niu, Zhaofan Luo
Citations: 0

Abstract


Background: Colonoscopy remains the predominant diagnostic modality for colorectal cancer (CRC), as the diagnostic performance of tumor markers alone, particularly in the early stages of the disease, is limited. This study sought to develop a diagnostic model for CRC that integrates various laboratory parameters.

Methods: One hundred patients with CRC were assigned to an experimental group, while 114 patients with benign colorectal diseases and 101 healthy individuals were assigned to a control group. Clinical and laboratory data were collected for each participant, including tumor markers such as carcinoembryonic antigen (CEA), carbohydrate antigen 19-9 (CA19-9), and carbohydrate antigen 242 (CA242), as well as blood count parameters, blood biochemical parameters, and coagulation parameters. Three machine-learning models [multilayer perceptron (MLP), eXtreme Gradient Boosting (XGBoost), and random forest (RF)] were used to construct CRC diagnostic models. The performance of each model was evaluated based on its area under the curve (AUC), sensitivity, and specificity.
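The three evaluation metrics named in the Methods can be illustrated with a minimal, dependency-free sketch. This is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made purely for illustration, and the AUC is computed with the standard pairwise (Mann-Whitney) formulation.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels, 1 = CRC, 0 = control."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # CRC correctly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # CRC missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # control correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # control wrongly flagged
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data invented for illustration only; these are NOT values from the study.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
preds = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption

sens, spec = sensitivity_specificity(labels, preds)
auc = roc_auc(labels, scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={auc:.4f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas the AUC summarizes ranking quality across all thresholds, which is presumably why the abstract reports all three.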

Results: Twelve parameters (CEA, CA19-9, CA242, absolute neutrophil value (NEUT), hemoglobin, the neutrophil/lymphocyte ratio, the platelet/lymphocyte ratio, alanine aminotransferase, alkaline phosphatase, aspartate aminotransferase, albumin, and prothrombin time) were selected to build the diagnostic model. On the validation set, the RF machine-learning model achieved the highest performance in identifying CRC [AUC: 0.902 (95% confidence interval: 0.812-0.989), accuracy: 0.803, sensitivity: 0.908, specificity: 0.772, positive predictive value: 0.664, negative predictive value: 0.890, and F1 score: 0.763]. The AUC, sensitivity, specificity, and Youden's index for the combined diagnosis with the tumor markers CEA, CA19-9, and CA242 were 0.761, 0.486, 0.983, and 0.469, respectively. The RF diagnostic model showed better diagnostic efficacy than the combined tumor-marker model (CEA, CA19-9, and CA242).
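Youden's index is defined as sensitivity + specificity - 1, so the value reported for the combined tumor markers can be checked directly from the figures in the abstract. The short sketch below does that arithmetic; the function name is hypothetical, and the RF value it prints is derived here from the reported sensitivity and specificity, not a figure stated by the authors.

```python
def youden_index(sensitivity: float, specificity: float) -> float:
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


# Sensitivity and specificity as reported in the abstract.
markers_j = youden_index(0.486, 0.983)  # combined CEA + CA19-9 + CA242
rf_j = youden_index(0.908, 0.772)       # RF model (derived here, not reported)

print(f"combined tumor markers: J = {markers_j:.3f}")  # 0.469, matches the abstract
print(f"random forest model:    J = {rf_j:.3f}")       # 0.680
```

The comparison makes the trade-off visible: the tumor-marker combination is highly specific but misses roughly half of the CRC cases, while the RF model gives up some specificity for much higher sensitivity.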

Conclusions: The use of machine learning combined with multiple laboratory parameters effectively improved the diagnostic efficiency of CRC and provided more accurate results for clinical diagnosis.

Source journal
CiteScore: 3.20
Self-citation rate: 0.00%
Articles published: 171
About the journal: Journal of Gastrointestinal Oncology (Print ISSN 2078-6891; Online ISSN 2219-679X; J Gastrointest Oncol; JGO), the official journal of the Society for Gastrointestinal Oncology (SGO), is an open-access, international peer-reviewed journal. It was published quarterly from Sep. 2010 to Dec. 2013, has been published bimonthly since Feb. 2014, and is openly distributed worldwide. JGO publishes manuscripts that focus on updated and practical information about diagnosis, prevention and clinical investigations of gastrointestinal cancer treatment. Specific areas of interest include, but are not limited to, multimodality therapy, markers, imaging and tumor biology.