IF 7.6 | Zone 1, Medicine | JCR Q1: HEALTH CARE SCIENCES & SERVICES
Xingjian Xiao, Shiyou Liu, Kubra Maqsood, Xiaohan Yi, Guoqun Xie, Hailei Zhao, Bo Sun, Jianying Mao, Xianglong Xu
Journal: The Lancet Regional Health: Western Pacific, volume 55, Article 101356
Publication date: 2025-02-01
Publication type: Journal Article
DOI: 10.1016/j.lanwpc.2024.101356
URL: https://www.sciencedirect.com/science/article/pii/S266660652400350X
Using machine learning algorithms to predict colorectal polyps

Background

Colorectal cancer (CRC) is the third most common cancer worldwide, and colorectal polyps (CRP) represent a necessary pathway to the development of CRC. Surveys indicate that the global prevalence of colorectal polyps is 20% at age 45 and rises to 50% to 60% by age 85. In China, the prevalence of colorectal polyps among residents is approximately 18.1% and increases with age. Until now, no studies have used non-invasive factors to predict colorectal polyps.

Methods

Our study used data from a population-based cross-sectional survey. We included 5,461 cases with colonoscopy results from among 49,701 initial positive consultations in the colorectal cancer screening project conducted in Baoshan District, Shanghai, from 2013 to 2021. Multiple machine learning models, including an adaptive boosting classifier and a gradient boosting machine, were built to predict colorectal polyps. For the outcome, patients diagnosed with colorectal polyps through clinical colonoscopy results, pathological findings, and imaging techniques were considered to have colorectal polyps. A model was considered acceptable for predicting colorectal polyps if its area under the curve (AUC) exceeded 0.7. The optimal model was used to identify predictors of colorectal polyps.
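
The acceptance rule in the Methods can be illustrated with a minimal sketch. This is not the study's code: the function names `auc_score` and `is_acceptable` and the toy labels and scores are assumptions made for illustration. The AUC is computed from its pairwise (Mann-Whitney) definition, that is, the share of positive/negative pairs in which the positive case receives the higher score, with ties counted as half.

```python
from typing import Sequence


def auc_score(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) definition."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))


def is_acceptable(auc: float, threshold: float = 0.7) -> bool:
    """Apply the rule stated in the abstract: the AUC must exceed 0.7."""
    return auc > threshold


# Toy data (hypothetical): 1 = polyp found at colonoscopy, 0 = no polyp.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.6, 0.4, 0.5, 0.3, 0.2, 0.4]

toy_auc = auc_score(toy_labels, toy_scores)
print(f"AUC = {toy_auc:.3f}, acceptable: {is_acceptable(toy_auc)}")
```

The comparison is strict, following the wording "exceeding 0.7", so a model whose unrounded AUC is exactly 0.7 would not pass under this reading.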

Findings

Non-invasive predictors such as sociodemographic information, behavioural history, and medical history were used to predict the current occurrence of colorectal polyps. In predicting the occurrence of colorectal polyps, Random Forest and eXtreme Gradient Boosting reached an AUC of 0.71, while Adaptive Boosting Machine, Gradient Boosting Machine, and Light Gradient Boosting Machine reached 0.7. Age, smoking, gender, cancer history, FOBT (Fecal Occult Blood Test), occupation, and education level were important predictors of colorectal polyps.
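
The abstract does not say how predictors were ranked. One common option, shown here purely as an assumption, is permutation importance: shuffle one predictor at a time and measure how much the AUC drops. In the sketch below, `toy_risk` is a hypothetical stand-in for a fitted model, and the rows, labels, and the `noise` column are invented for illustration.

```python
import random
from typing import Callable, Dict, List, Sequence

Row = Dict[str, float]


def auc_score(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise (Mann-Whitney) AUC; ties count as half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def permutation_importance(rows: List[Row], labels: Sequence[int],
                           predict: Callable[[Row], float],
                           features: Sequence[str],
                           repeats: int = 20, seed: int = 0) -> Dict[str, float]:
    """Mean AUC drop when one feature's values are shuffled across rows."""
    rng = random.Random(seed)
    baseline = auc_score(labels, [predict(r) for r in rows])
    importance: Dict[str, float] = {}
    for name in features:
        drops = []
        for _ in range(repeats):
            shuffled = [r[name] for r in rows]
            rng.shuffle(shuffled)
            # Copy each row with only the chosen feature replaced.
            permuted = [{**row, name: value} for row, value in zip(rows, shuffled)]
            drops.append(baseline - auc_score(labels, [predict(r) for r in permuted]))
        importance[name] = sum(drops) / repeats
    return importance


def toy_risk(row: Row) -> float:
    """Hypothetical risk score standing in for a fitted model; ignores 'noise'."""
    return 0.05 * row["age"] + 0.5 * row["smoker"]


# Invented rows: age in years, smoker as 0/1, noise unrelated to the outcome.
ages = [72, 68, 65, 61, 58, 50, 47, 45]
smokers = [1, 1, 0, 1, 0, 0, 1, 0]
noise = [3, 1, 4, 1, 5, 9, 2, 6]
rows = [{"age": a, "smoker": s, "noise": z} for a, s, z in zip(ages, smokers, noise)]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = polyp present (toy outcome)

importance = permutation_importance(rows, labels, toy_risk, ["age", "smoker", "noise"])
for name, drop in sorted(importance.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: mean AUC drop {drop:.3f}")
```

A predictor the model never uses, such as `noise` here, gets an importance of exactly zero, because shuffling it leaves every prediction unchanged.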

Interpretation

Non-invasive factors combined with machine learning algorithms can accurately predict the occurrence of colorectal polyps in individuals with positive initial screening results. The rate of colonoscopy examinations is very low, even among individuals with positive initial screening results. In this context, our machine learning predictive models may help prompt patients to undergo further examinations and interventions, thereby improving early diagnosis and treatment. We propose a machine learning approach that can identify individuals with colorectal polyps in this group, thereby increasing the screening rate for colorectal cancer and helping to prevent the disease.

Funding

This study was supported by Health Promotion and Education of the Key Medical Specialty of Baoshan District, Shanghai (BSZK-2023-BZ14), the Traditional Chinese Medicine Research Project of Shanghai Municipal Health Commission (20240N108), and Construction of Traditional Chinese Medicine Inheritance and Innovation Development Demonstration Pilot Projects in Pudong New Area - High-Level Research-Oriented Traditional Chinese Medicine Hospital Construction (C-2023-0901).
Source journal: The Lancet Regional Health: Western Pacific
Subject area: Medicine - Pediatrics, Perinatology and Child Health
CiteScore: 8.80
Self-citation rate: 2.80%
Articles published: 305
Review time: 11 weeks
About the journal: The Lancet Regional Health – Western Pacific, a gold open access journal, is an integral part of The Lancet's global initiative advocating for healthcare quality and access worldwide. It aims to advance clinical practice and health policy in the Western Pacific region, contributing to enhanced health outcomes. The journal publishes high-quality original research shedding light on clinical practice and health policy in the region. It also includes reviews, commentaries, and opinion pieces covering diverse regional health topics, such as infectious diseases, non-communicable diseases, child and adolescent health, maternal and reproductive health, aging health, mental health, the health workforce and systems, and health policy.