Shuxian Yu, Haiyang Jiang, Jing Xia, Jie Gu, Mengting Chen, Yan Wang, Xiaohong Zhao, Zehua Liao, Puhua Zeng, Tian Xie, Xinbing Sui
Chinese Medicine, volume 20, issue 1, article 7. Published 2025-01-07. Publication type: Journal Article. Impact factor: 5.3; JCR Q1 (INTEGRATIVE & COMPLEMENTARY MEDICINE).
DOI: 10.1186/s13020-025-01059-4 (https://doi.org/10.1186/s13020-025-01059-4)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11705657/pdf/
Construction of machine learning-based models for screening the high-risk patients with gastric precancerous lesions.
Background: The individualized prediction and discrimination of precancerous lesions of gastric cancer (PLGC) are critical for the early prevention of gastric cancer (GC). However, accurate non-invasive methods for distinguishing between PLGC and GC are currently lacking. This study therefore aimed to develop a risk prediction model using machine learning and deep learning techniques to aid the early diagnosis of GC.
Methods: A total of 2229 subjects were recruited from nine tertiary hospitals between October 2022 and November 2023. We designed a comprehensive questionnaire, identified statistically significant factors, and created a web-based nomogram. A risk prediction model was then developed using machine learning techniques. In addition, a tongue image-based risk prediction model was established using deep learning algorithms.
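A nomogram of the kind reported in this abstract is, in general, a graphical rendering of a logistic regression model: each factor contributes points proportional to its coefficient, and the total maps to a predicted probability. The sketch below illustrates that general idea only. The coefficients, intercept, value ranges, and factor names (`age_per_decade`, `irregular_meals`, `pickled_food`, `anxiety`) are hypothetical placeholders chosen for illustration; they are not the values fitted in the study.

```python
import math

# Hypothetical intercept and coefficients, for illustration only.
# They are NOT taken from the paper.
INTERCEPT = -3.0
COEFFICIENTS = {
    "age_per_decade": 0.40,
    "irregular_meals": 0.60,
    "pickled_food": 0.50,
    "anxiety": 0.30,
}

# Hypothetical (min, max) range of each factor, used to scale the point axes.
RANGES = {
    "age_per_decade": (2.0, 8.0),
    "irregular_meals": (0.0, 1.0),
    "pickled_food": (0.0, 1.0),
    "anxiety": (0.0, 1.0),
}


def predicted_risk(features):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    linear = INTERCEPT + sum(
        COEFFICIENTS[name] * value for name, value in features.items()
    )
    return 1.0 / (1.0 + math.exp(-linear))


def nomogram_points(features):
    """Convert each factor's contribution to points on a 0-100 axis.

    The factor with the largest possible contribution (coefficient times
    range width) spans the full 100 points. All coefficients here are
    positive, so a higher value always adds points.
    """
    spans = {
        name: COEFFICIENTS[name] * (hi - lo) for name, (lo, hi) in RANGES.items()
    }
    scale = 100.0 / max(spans.values())
    return {
        name: COEFFICIENTS[name] * (value - RANGES[name][0]) * scale
        for name, value in features.items()
    }


# A made-up respondent: 60 years old, irregular meals, eats pickled food.
patient = {
    "age_per_decade": 6.0,
    "irregular_meals": 1.0,
    "pickled_food": 1.0,
    "anxiety": 0.0,
}

print("points per factor:", nomogram_points(patient))
print("predicted risk:", round(predicted_risk(patient), 3))
```

The point scale carries no information beyond the linear predictor; it only makes the model readable without a calculator, which is what an interactive web nomogram automates.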
Results: Based on logistic regression analysis, a dynamic web-based nomogram was developed; it is freely accessible at https://yz6677.shinyapps.io/GC67/ . The prediction model was then built using ten different machine learning algorithms, of which the Random Forest (RF) model achieved the highest accuracy, 85.65%. According to the predictive results, the top 10 key risk factors were age, traditional Chinese medicine (TCM) constitution type, tongue coating color, tongue color, irregular meals, pickled food, greasy fur, a habit of eating overly hot food, anxiety, and sleep onset latency. These factors are all significant risk indicators for progression from PLGC to GC. Subsequently, the Swin Transformer architecture was used to develop a tongue image-based model for predicting the risk of PLGC progression. The validation set showed an accuracy of 73.33% and an area under the curve (AUC) greater than 0.8 across all models.
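The results above quote two different metrics, accuracy and AUC. As a reminder of how they differ, the following minimal sketch computes both from scratch on a small made-up set of labels and scores (the data are invented for illustration and have no connection to the study). Accuracy depends on a decision threshold, here assumed to be 0.5, whereas AUC measures how well the scores rank positives above negatives regardless of threshold.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented example data: 1 = progressed, 0 = did not progress.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]

# Assumed decision threshold of 0.5 for turning scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print("accuracy:", round(accuracy(y_true, y_pred), 4))
print("AUC:", round(roc_auc(y_true, scores), 4))
```

In this toy example the AUC is noticeably higher than the accuracy, which shows that a model can rank cases well while a fixed threshold still misclassifies some of them.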
Conclusions: Our study developed machine learning- and deep learning-based models for predicting the risk of progression from PLGC to GC, which will help identify high-risk patients among those with PLGC and improve the early diagnosis of GC.
Journal: Chinese Medicine
Subject categories: INTEGRATIVE & COMPLEMENTARY MEDICINE; PHARMACOLOGY & PHARMACY
CiteScore: 7.90
Self-citation rate: 4.10%
Articles published: 133
Review time: 31 weeks
About the journal:
Chinese Medicine is an open access, online journal publishing evidence-based, scientifically justified, and ethical research into all aspects of Chinese medicine.
Areas of interest include recent advances in herbal medicine, clinical nutrition, clinical diagnosis, acupuncture, pharmaceutics, biomedical sciences, epidemiology, education, informatics, sociology, and psychology that are relevant and significant to Chinese medicine. Examples of research approaches include biomedical experimentation, high-throughput technology, clinical trials, systematic reviews, meta-analysis, sampled surveys, simulation, data curation, statistics, omics, translational medicine, and integrative methodologies.
Chinese Medicine is a credible channel to communicate unbiased scientific data, information, and knowledge in Chinese medicine among researchers, clinicians, academics, and students in Chinese medicine and other scientific disciplines of medicine.