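Machine learning technique-based four-autoantibody test for early detection of esophageal squamous cell carcinoma: a multicenter, retrospective study with a nested case-control study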

IF 7 | CAS Zone 1 (Medicine) | JCR Q1: MEDICINE, GENERAL & INTERNAL

BMC Medicine | Pub Date: 2025-04-23 | DOI: 10.1186/s12916-025-04066-2

Yi-Wei Xu, Yu-Hui Peng, Can-Tong Liu, Hao Chen, Ling-Yu Chu, Hai-Lu Chen, Zhi-Yong Wu, Wen-Qiang Wei, Li-Yan Xu, Fang-Cai Wu, En-Min Li


Citations: 0

Abstract


Machine learning technique-based four-autoantibody test for early detection of esophageal squamous cell carcinoma: a multicenter, retrospective study with a nested case-control study.

Background: Autoantibodies are promising blood-based diagnostic biomarkers that may be generated before the first clinically detectable signs of cancer. In the present study, we aimed to identify a novel, optimized autoantibody panel with high diagnostic accuracy for clinical and preclinical esophageal squamous cell carcinoma (ESCC) using machine learning (ML) algorithms.

Methods: We identified potential autoantibodies against tumor-associated antigens by serological proteome analysis. Serum autoantibody levels were measured by ELISA. Using a training set (n = 531), 102 models based on ML algorithms were constructed. The Partial Least Squares Generalized Linear Model (plsRglm) was selected using receiver operating characteristic (ROC) analysis, the Kolmogorov-Smirnov (K-S) test, and the Population Stability Index (PSI), and was further validated in an internal validation set (n = 413), external validation set 1 (n = 371), and external validation set 2 (n = 202). We then validated the ability of the plsRglm model to predict preclinical ESCC in a nested case-control study (24 preclinical ESCCs and 112 matched controls) within a population-based prospective cohort study.
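The three model-selection criteria named in the Methods can be illustrated with a minimal plain-Python sketch. The abstract does not say how the authors computed them, so this is an assumption-laden illustration rather than the study's implementation: the function names, the toy scores, and the PSI bin edges are all hypothetical, and none of the numbers are study data.

```python
import math
from bisect import bisect_right

def auc(pos, neg):
    """Probability that a random case scores above a random control (ties count 1/2)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def ks_statistic(pos, neg):
    """Largest gap between the empirical score distributions of cases and controls."""
    sp, sn = sorted(pos), sorted(neg)
    return max(
        abs(bisect_right(sp, t) / len(sp) - bisect_right(sn, t) / len(sn))
        for t in sp + sn
    )

def psi(expected, actual, edges, eps=1e-6):
    """Population Stability Index between two score samples, binned at `edges`."""
    def shares(scores):
        counts = [0] * (len(edges) + 1)
        for s in scores:
            counts[bisect_right(edges, s)] += 1
        # eps keeps empty bins from producing log(0) or division by zero
        return [max(c / len(scores), eps) for c in counts]
    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

# Toy model scores (hypothetical, not from the study)
cases = [0.9, 0.8, 0.7, 0.4]
controls = [0.6, 0.3, 0.2, 0.1]
print(auc(cases, controls), ks_statistic(cases, controls))  # 0.9375 0.75

# PSI compares a score distribution between two cohorts, e.g. training vs validation
train_scores = [0.1, 0.2, 0.6, 0.7]
valid_scores = [0.1, 0.6, 0.7, 0.8]
print(psi(train_scores, valid_scores, edges=[0.5]))
```

AUC and the K-S statistic both measure how well the scores separate cases from controls. PSI instead measures how much the score distribution shifts between cohorts, which is why it serves as a robustness check rather than an accuracy measure.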

Results: ROC analysis, the K-S test, and PSI showed that the plsRglm model based on four autoantibodies (ALDOA, ENO1, p53, and NY-ESO-1) exhibited the best diagnostic performance and robustness. It diagnosed ESCC with AUCs (sensitivities and specificities) of 0.860 (68.8% and 90.4%) in the training set, 0.826 (65.3% and 89.1%) in the internal validation set, and 0.851 (69.2% and 87.3%) in external validation set 1. For early-stage ESCC, this signature maintained its diagnostic performance: 0.817 (62.3% and 90.4%) in the training set; 0.842 (62.5% and 89.1%) in the internal validation set; 0.854 (63.2% and 87.3%) in external validation set 1; and 0.850 (67.3% and 90.1%) in external validation set 2. In the nested case-control study, the plsRglm model detected preclinical ESCC with an AUC of 0.723, a sensitivity of 54.2%, and a specificity of 86.6%.
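To make the nested case-control percentages concrete, the short sketch below back-calculates the counts most consistent with the reported sensitivity and specificity, given the 24 preclinical ESCCs and 112 matched controls. These counts are inferred from the rounded percentages and are not reported in the abstract; the helper name `closest_count` is hypothetical.

```python
def closest_count(rate_percent, total):
    """Integer count k in 0..total whose rate k/total is nearest the reported percentage."""
    return min(range(total + 1), key=lambda k: abs(100 * k / total - rate_percent))

n_cases, n_controls = 24, 112  # group sizes stated in the Methods

true_pos = closest_count(54.2, n_cases)     # preclinical ESCCs flagged by the model
true_neg = closest_count(86.6, n_controls)  # controls correctly left unflagged

false_neg = n_cases - true_pos
false_pos = n_controls - true_neg

print(f"inferred TP={true_pos}, FN={false_neg}, TN={true_neg}, FP={false_pos}")
print(f"sensitivity={100 * true_pos / n_cases:.1f}%, "
      f"specificity={100 * true_neg / n_controls:.1f}%")
```

Under this inference, the model flagged about 13 of the 24 preclinical cases and raised about 15 false alarms among the 112 controls.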

Conclusions: Our findings indicated that the plsRglm model based on four autoantibodies might help identify preclinical and early-stage ESCC.


Source journal

BMC Medicine (Medicine: General & Internal)

CiteScore: 13.10

Self-citation rate: 1.10%

Articles published: 435

Review time: 4-8 weeks

About the journal: BMC Medicine is an open access, transparent peer-reviewed general medical journal. It is the flagship journal of the BMC series and publishes outstanding and influential research in various areas including clinical practice, translational medicine, medical and health advances, public health, global health, policy, and general topics of interest to the biomedical and sociomedical professional communities. In addition to research articles, the journal also publishes stimulating debates, reviews, unique forum articles, and concise tutorials. All articles published in BMC Medicine are included in various databases such as Biological Abstracts, BIOSIS, CAS, Citebase, Current Contents, DOAJ, Embase, MEDLINE, PubMed, Science Citation Index Expanded, OAIster, SCImago, Scopus, SOCOLAR, and Zetoc.