Machine Learning-Based Prediction of Histopathological Classification in Colorectal Polyps
Gökhan Koker, Gizem Zorlu Gorgulugil, Muhammed Ali Coskuner, Merve Eren Durmus
Turkish Journal of Gastroenterology, vol. 36, no. 10, pp. 700-707. Published 2025-10-01. Journal Article.
DOI: 10.5152/tjg.2025.25542 (https://doi.org/10.5152/tjg.2025.25542)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12520147/pdf/
Impact factor: 1.6; JCR: Q4 (Gastroenterology & Hepatology)
Citations: 0
Abstract
Background/Aims: Colorectal polyps are precursor lesions of colorectal cancer, and their histopathological types are critical for determining malignant potential. Predicting polyp histopathological types may support early and appropriate clinical management. Machine learning (ML) algorithms based on accessible demographic, clinical, and lifestyle data can contribute to individualized screening strategies.

Materials and Methods: This retrospective cross-sectional study included 491 individuals who underwent colonoscopy for the first time between 2022 and 2025 at University of Health Sciences, Antalya Training and Research Hospital. Demographic and clinical data were recorded, and dietary habits were assessed using the Food Frequency Questionnaire. Patients were classified into 3 groups according to histopathology: adenomatous polyp, hyperplastic polyp, and no polyp. Four ML algorithms were applied: decision tree, random forest, support vector machines (SVMs), and extreme gradient boosting. Model performance was evaluated using accuracy, sensitivity, specificity, kappa statistic, and McNemar's test. Variable contributions were further analyzed with SHapley Additive exPlanations (SHAP).

Results: Accuracy ranged from 70.9% to 76.4%, with the highest performance from SVM (76.4%) and random forest (75.7%). Extreme gradient boosting showed lower overall accuracy (70.9%) but was the only model that identified hyperplastic polyps. The no-polyp group was consistently predicted with high accuracy (sensitivity 85.6% to 95.9%). Precision for adenomatous polyps was highest with SVM (71.4%). SHAP analysis highlighted frequent bulgur consumption (>2 times/week), red meat intake, age, and body mass index as major predictors.

Conclusion: Machine learning algorithms can predict colorectal polyp histopathological types using routine demographic, clinical, and dietary data, enabling more personalized and effective screening beyond age-based protocols.
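The evaluation metrics named in the Materials and Methods (accuracy, per-class sensitivity and specificity, the kappa statistic, and McNemar's test) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the label strings, and the ten toy cases are hypothetical and are not study data. The abstract does not say how McNemar's test was applied; the sketch assumes it compares two classifiers' correctness on the same cases and uses the exact binomial form.

```python
from math import comb

# Hypothetical label strings for the three histopathology groups.
LABELS = ["adenomatous", "hyperplastic", "no_polyp"]


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total


def sensitivity_specificity(matrix, k):
    """One-vs-rest sensitivity and specificity for class index k."""
    total = sum(sum(row) for row in matrix)
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp
    fp = sum(row[k] for row in matrix) - tp
    tn = total - tp - fn - fp
    sens = tp / (tp + fn) if tp + fn else float("nan")
    spec = tn / (tn + fp) if tn + fp else float("nan")
    return sens, spec


def cohen_kappa(matrix):
    """Agreement between true and predicted labels, corrected for chance."""
    total = sum(sum(row) for row in matrix)
    observed = accuracy(matrix)
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix)
        for i in range(len(matrix))
    ) / total ** 2
    return (observed - expected) / (1 - expected)


def mcnemar_exact(y_true, pred_a, pred_b):
    """Exact two-sided McNemar p-value on the discordant pairs
    (assumed usage: comparing two classifiers on the same cases)."""
    rows = list(zip(y_true, pred_a, pred_b))
    only_a = sum(1 for t, a, b in rows if a == t and b != t)
    only_b = sum(1 for t, a, b in rows if a != t and b == t)
    n = only_a + only_b
    if n == 0:
        return 1.0
    tail = sum(comb(n, i) for i in range(min(only_a, only_b) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Toy data only: ten made-up cases, two made-up sets of predictions.
y_true = ["adenomatous"] * 3 + ["hyperplastic"] * 2 + ["no_polyp"] * 5
pred_a = ["adenomatous", "adenomatous", "no_polyp", "no_polyp", "adenomatous",
          "no_polyp", "no_polyp", "no_polyp", "no_polyp", "adenomatous"]
pred_b = ["adenomatous", "no_polyp", "no_polyp", "hyperplastic", "adenomatous",
          "no_polyp", "no_polyp", "no_polyp", "no_polyp", "no_polyp"]

matrix_a = confusion_matrix(y_true, pred_a, LABELS)
print("accuracy:", accuracy(matrix_a))
print("kappa:", cohen_kappa(matrix_a))
for k, label in enumerate(LABELS):
    print(label, sensitivity_specificity(matrix_a, k))
print("McNemar p:", mcnemar_exact(y_true, pred_a, pred_b))
```

In the toy predictions, the first classifier never predicts the hyperplastic class, so its sensitivity for that class is zero even though overall accuracy looks moderate. This is the kind of pattern the Results describe for the models that did not identify hyperplastic polyps, and it is why per-class metrics are reported alongside accuracy.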
About the journal:
The Turkish Journal of Gastroenterology (Turk J Gastroenterol) is the double-blind peer-reviewed, open access, international publication of the Turkish Society of Gastroenterology. The journal is published bimonthly, in January, March, May, July, September, and November, and its publication language is English.
The Turkish Journal of Gastroenterology aims to publish international work of the highest clinical and scientific level on original issues in gastroenterology and hepatology. The journal publishes original papers, review articles, case reports, and letters to the editor on clinical and experimental gastroenterology and hepatology.