Development and validation of an integrative 54 biomarker-based risk identification model for multi-cancer in 42,666 individuals: a population-based prospective study to guide advanced screening strategies.

IF 11.5; Zone 2, Medicine; JCR Q1, Medicine, Research & Experimental
Renjia Zhao, Huangbo Yuan, Yanfeng Jiang, Zhenqiu Liu, Ruilin Chen, Shuo Wang, Linyao Lu, Ziyu Yuan, Zhixi Su, Qiye He, Kelin Xu, Tiejun Zhang, Li Jin, Ming Lu, Weimin Ye, Rui Liu, Chen Suo, Xingdong Chen
Biomarker Research, 13(1):101. Published 2025-08-11. DOI: 10.1186/s40364-025-00812-z (https://doi.org/10.1186/s40364-025-00812-z). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12341305/pdf/
Citations: 0

Abstract

Background: Early identification of high-risk individuals is crucial for optimizing cancer screening, particularly when considering expensive and invasive methods such as multi-omics technologies and endoscopic procedures. However, developing a robust, practical multi-cancer risk prediction model that integrates diverse, multi-scale data and is properly validated remains a significant challenge.

Methods: We initiated the FuSion study by recruiting 42,666 participants from Taizhou, China, forming a discovery cohort (n = 16,340) and an independent validation cohort (n = 26,308) after applying exclusion criteria. We integrated multi-scale data from 54 blood-derived biomarkers and 26 epidemiological exposures to develop a risk prediction model for five common cancers: lung, esophageal, liver, gastric, and colorectal cancer. Employing five supervised machine learning approaches, we used a LASSO-based feature selection strategy to identify the most informative predictors. The model was trained and internally validated in the discovery cohort, externally applied in the validation cohort, and further evaluated through prospective clinical follow-up, in which cancer events were assessed via clinical examinations.
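
As an illustration of how LASSO-based feature selection works in general, the following is a minimal sketch and not the authors' pipeline. It assumes a linear-regression LASSO solved by cyclic coordinate descent on synthetic data as a stand-in, because the abstract does not specify the implementation. The feature names, the penalty value `lam=0.2`, and the simulated outcome are all hypothetical.

```python
import random


def soft_threshold(z, lam):
    """Soft-thresholding operator: shrinks z toward zero by lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/2n)*||y - Xw||^2 + lam*||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of feature j with the residual that excludes j's own term.
            rho = sum(X[i][j] * (residual[i] + w[j] * X[i][j]) for i in range(n)) / n
            new_w = soft_threshold(rho, lam) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * X[i][j]
                w[j] = new_w
    return w


# Synthetic data (hypothetical): only "marker_a" and "age" drive the outcome.
random.seed(0)
feature_names = ["marker_a", "marker_b", "marker_c", "age", "exposure_x", "exposure_y"]
n_samples = 200
X = [[random.gauss(0.0, 1.0) for _ in feature_names] for _ in range(n_samples)]
y = [2.0 * row[0] - 1.5 * row[3] + random.gauss(0.0, 0.1) for row in X]

weights = lasso_coordinate_descent(X, y, lam=0.2)
# Features whose coefficient was shrunk exactly to zero are dropped.
selected = [name for name, w in zip(feature_names, weights) if w != 0.0]
print(selected)
```

The L1 penalty sets the coefficients of uninformative predictors exactly to zero, which is why LASSO can serve as a selection step before a final model is fitted. For a binary cancer outcome, an L1-penalised logistic model would be the more natural choice; the linear version is used here only to keep the sketch short.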

Results: The final model comprised four key biomarkers along with age, sex, and smoking intensity, and achieved an AUROC of 0.767 (95% CI: 0.723-0.814) for five-year risk prediction. High-risk individuals (17.19% of the cohort) accounted for 50.42% of incident cancer cases, with a 15.19-fold increased risk compared to the low-risk group. During follow-up of 2,863 high-risk subjects, 9.64% were newly diagnosed with cancer or precancerous lesions. Notably, cancer detection in the high-risk group was 5.02 times higher than in the low-risk group and 1.74 times higher than in the intermediate-risk group. In particular, the incidence of esophageal cancers in the high-risk group was 16.84 times that of the low-risk group.
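
To make the reported metrics concrete, here is a minimal sketch of how an AUROC is defined and of the arithmetic behind the stratification figures. The toy scores and labels are hypothetical. The enrichment ratio and the approximate case count are derived here from the percentages above and are not figures stated in the abstract.

```python
def auroc(scores, labels):
    """Probability that a random case scores higher than a random non-case.

    Ties count as half a concordant pair (the Mann-Whitney formulation).
    """
    cases = [s for s, l in zip(scores, labels) if l == 1]
    controls = [s for s, l in zip(scores, labels) if l == 0]
    concordant = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                concordant += 1.0
            elif c == k:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Toy example (hypothetical risk scores; 1 = incident cancer, 0 = no cancer).
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_auc = auroc(toy_scores, toy_labels)

# Percentages reported in the abstract.
high_risk_share = 17.19   # % of the cohort classified as high risk
case_share = 50.42        # % of incident cancer cases found in that group
followed_up = 2863        # high-risk subjects with clinical follow-up
detection_rate = 9.64     # % diagnosed with cancer or precancerous lesions

# Derived quantities (not stated in the abstract).
enrichment = case_share / high_risk_share
approx_detected = round(followed_up * detection_rate / 100)

print(f"toy AUROC: {toy_auc:.3f}")
print(f"case enrichment in high-risk group: {enrichment:.2f}x")
print(f"approximate diagnoses during follow-up: {approx_detected}")
```

Read this way, the high-risk group holds roughly 2.9 times as many incident cases as its share of the cohort would suggest, and 9.64% of 2,863 corresponds to about 276 individuals.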

Conclusions: This is the first population-based prospective study in a large Chinese cohort that leverages multi-scale data including biomarkers for multi-cancer risk prediction. Our effective risk stratification model not only enhances early cancer detection but also lays the foundation for the targeted application of advanced screening methods, including but not limited to multi-omics technologies and endoscopy. These findings support precision prevention strategies and the optimal allocation of healthcare resources.

Source journal: Biomarker Research (Biochemistry, Genetics and Molecular Biology: Molecular Medicine)
CiteScore: 15.80
Self-citation rate: 1.80%
Publication volume: 80
Review time: 10 weeks
Journal description: Biomarker Research, an open-access, peer-reviewed journal, covers all aspects of biomarker investigation. It seeks to publish original discoveries, novel concepts, commentaries, and reviews across various biomedical disciplines. The field of biomarker research has progressed significantly with the rise of personalized medicine and individual health. Biomarkers play a crucial role in drug discovery and development, as well as in disease diagnosis, treatment, prognosis, and prevention, particularly in the genome era.