Authors: Ren-Xuan Gao, Xin-Lei Wang, Ming-Jie Tian, Xiao-Ming Li, Jia-Jia Zhang, Jun-Jing Wang, Jing Gao, Chao Zhang, Zhi-Ting Li
Journal: World Journal of Gastrointestinal Endoscopy, volume 17, issue 7, article 108307 (impact factor 1.8; JCR Q4, Gastroenterology & Hepatology)
Published: 2025-07-16
Publication type: Journal Article
DOI: 10.4253/wjge.v17.i7.108307 (https://doi.org/10.4253/wjge.v17.i7.108307)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12264806/pdf/
Construction and validation of a machine learning algorithm-based predictive model for difficult colonoscopy insertion.
Background: Difficulty of colonoscopy insertion (DCI) significantly affects colonoscopy effectiveness and serves as a key quality indicator. Predicting and evaluating DCI risk preoperatively is crucial for optimizing intraoperative strategies.
Aim: To evaluate the predictive performance of machine learning (ML) algorithms for DCI by comparing three modeling approaches, identify factors influencing DCI, and develop a preoperative prediction model using ML algorithms to enhance colonoscopy quality and efficiency.
Methods: This cross-sectional study enrolled 712 patients who underwent colonoscopy at a tertiary hospital between June 2020 and May 2021. Demographic data, past medical history, medication use, and psychological status were collected. The endoscopist assessed DCI using a visual analogue scale. After univariate screening, predictive models were developed using multivariable logistic regression, least absolute shrinkage and selection operator (LASSO) regression, and random forest (RF) algorithms. Model performance was evaluated in terms of discrimination, calibration, and decision curve analysis (DCA), and results were visualized using nomograms.
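As a rough illustration of the discrimination measures named in the Methods, the following minimal Python sketch computes sensitivity, specificity, and AUC from predicted probabilities. It is not the authors' code: the function names, the 0.5 cut-off, and the eight-patient toy data are assumptions made only for this example, and the abstract does not state which software or cut-offs were used.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_prob: Sequence[float],
                            threshold: float) -> Tuple[float, float]:
    """Sensitivity and specificity when 'prob >= threshold' is called positive.

    Assumes both classes (1 = difficult insertion, 0 = not) are present.
    """
    pairs = list(zip(y_true, y_prob))
    tp = sum(1 for y, p in pairs if y == 1 and p >= threshold)
    fn = sum(1 for y, p in pairs if y == 1 and p < threshold)
    tn = sum(1 for y, p in pairs if y == 0 and p < threshold)
    fp = sum(1 for y, p in pairs if y == 0 and p >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the probability that a random positive case is scored higher
    than a random negative case (ties count as one half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = 0.0
    for p_pos in pos:
        for p_neg in neg:
            if p_pos > p_neg:
                wins += 1.0
            elif p_pos == p_neg:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_prob = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

sens, spec = sensitivity_specificity(y_true, y_prob, threshold=0.5)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={auc(y_true, y_prob):.4f}")
```

Sensitivity and specificity depend on the chosen cut-off, whereas the AUC summarizes ranking quality across all cut-offs, which is why the two kinds of measure are usually reported together.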
Results: A total of 712 patients (53.8% male; mean age 54.5 ± 12.9 years) were included. Logistic regression analysis identified constipation [odds ratio (OR) = 2.254, 95% confidence interval (CI): 1.289-3.931], abdominal circumference (AC) (77.5-91.9 cm, OR = 1.895, 95%CI: 1.065-3.350; AC ≥ 92 cm, OR = 1.271, 95%CI: 0.730-2.188), and anxiety (OR = 1.071, 95%CI: 1.044-1.100) as predictive factors for DCI, validated by LASSO and RF methods. For the logistic regression, LASSO, and RF models, respectively, training/validation sensitivities were 0.826/0.925, 0.924/0.868, and 1.000/0.981; specificities were 0.602/0.511, 0.510/0.562, and 0.977/0.526; and the corresponding areas under the receiver operating characteristic curve (AUCs) were 0.780 (0.737-0.823)/0.726 (0.654-0.799), 0.754 (0.710-0.798)/0.723 (0.656-0.791), and 1.000 (1.000-1.000)/0.754 (0.688-0.820). DCA indicated optimal net benefit within probability thresholds of 0-0.9 and 0.05-0.37. The RF model demonstrated superior diagnostic accuracy, reflected by perfect training sensitivity (1.000) and the highest validation AUC (0.754), outperforming other methods in clinical applicability.
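The odds ratios above can be related back to logistic regression coefficients, since OR = exp(β) and a Wald 95% CI is exp(β ± 1.96·SE). The sketch below recovers β and an approximate SE from the reported constipation estimate and checks that the interval is reproduced to within rounding. It assumes the reported intervals are Wald intervals, which the abstract does not state; the function names are hypothetical. The last lines additionally assume that the anxiety OR is expressed per one-point increase of a continuous score on a log-linear scale, which is also not stated.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_or_and_se(or_value: float, ci_low: float, ci_high: float):
    """Recover the coefficient (log odds ratio) and its approximate standard
    error from a reported OR and Wald 95% CI."""
    beta = math.log(or_value)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    return beta, se


def or_with_ci(beta: float, se: float):
    """Odds ratio and Wald 95% CI from a coefficient and its standard error."""
    return (math.exp(beta),
            math.exp(beta - Z_95 * se),
            math.exp(beta + Z_95 * se))


# Reported values for constipation: OR = 2.254, 95% CI 1.289-3.931.
beta, se = log_or_and_se(2.254, 1.289, 3.931)
or_value, ci_low, ci_high = or_with_ci(beta, se)
print(f"beta={beta:.4f} SE={se:.4f} -> OR={or_value:.3f} ({ci_low:.3f}-{ci_high:.3f})")

# Under the stated assumption (per-point OR, log-linear effect), a 10-point
# difference in anxiety score multiplies the odds by 1.071 ** 10.
ten_point_or = 1.071 ** 10
print(f"10-point anxiety OR is roughly {ten_point_or:.2f}")
```

The reconstructed interval differs from the published one only in the third decimal place, which is consistent with rounding of the reported values.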
Conclusion: The RF-based model exhibited superior predictive accuracy for DCI compared to multivariable logistic and LASSO regression models. This approach supports individualized preoperative optimization, enhancing colonoscopy quality through targeted risk stratification.
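Risk stratification of this kind implies choosing a probability threshold above which a patient is flagged as high risk, and the decision curve analysis reported in the Results evaluates that choice through net benefit. The following minimal sketch uses the standard net-benefit formula, TP/n − FP/n × pt/(1 − pt), and compares a model with the "flag everyone" strategy. The function names and toy data are hypothetical, and nothing here reproduces the study's curves.

```python
from typing import Sequence


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float],
                threshold: float) -> float:
    """Net benefit of flagging patients whose predicted risk is >= threshold."""
    if not 0.0 <= threshold < 1.0:
        raise ValueError("threshold must be in [0, 1)")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= threshold)
    fp = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p >= threshold)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Net benefit of flagging every patient regardless of predicted risk."""
    if not 0.0 <= threshold < 1.0:
        raise ValueError("threshold must be in [0, 1)")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data, not taken from the study.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_prob = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

for pt in (0.2, 0.5):
    model_nb = net_benefit(y_true, y_prob, pt)
    all_nb = net_benefit_treat_all(y_true, pt)
    print(f"threshold={pt:.1f} model={model_nb:.5f} treat-all={all_nb:.5f}")
```

A model is useful at a given threshold only if its net benefit exceeds both the treat-all value and zero (the "flag no one" strategy), which is how threshold ranges such as those reported in the Results are read off a decision curve.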