Development and validation of an integrative 54 biomarker-based risk identification model for multi-cancer in 42,666 individuals: a population-based prospective study to guide advanced screening strategies.
Renjia Zhao, Huangbo Yuan, Yanfeng Jiang, Zhenqiu Liu, Ruilin Chen, Shuo Wang, Linyao Lu, Ziyu Yuan, Zhixi Su, Qiye He, Kelin Xu, Tiejun Zhang, Li Jin, Ming Lu, Weimin Ye, Rui Liu, Chen Suo, Xingdong Chen
Biomarker Research 13(1):101. Published 2025-08-11. DOI: 10.1186/s40364-025-00812-z (https://doi.org/10.1186/s40364-025-00812-z). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12341305/pdf/. Impact factor: 11.5; JCR Q1 (Medicine, Research & Experimental).
Abstract
Background: Early identification of high-risk individuals is crucial for optimizing cancer screening, particularly when considering expensive and invasive methods such as multi-omics technologies and endoscopic procedures. However, developing a robust, practical multi-cancer risk prediction model that integrates diverse, multi-scale data and is properly validated remains a significant challenge.
Methods: We initiated the FuSion study by recruiting 42,666 participants from Taizhou, China, yielding a discovery cohort (n = 16,340) and an independent validation cohort (n = 26,308) after applying exclusion criteria. We integrated multi-scale data from 54 blood-derived biomarkers and 26 epidemiological exposures to develop a risk prediction model for five common cancers: lung, esophageal, liver, gastric, and colorectal cancer. We employed five supervised machine learning approaches and used a LASSO-based feature selection strategy to identify the most informative predictors. The model was trained and internally validated in the discovery cohort, applied externally in the validation cohort, and further evaluated through prospective clinical follow-up to assess cancer events via clinical examinations.
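To illustrate how a LASSO penalty can shrink a large candidate panel down to a few predictors, here is a minimal sketch in plain Python. It is not the authors' pipeline: it fits a least-squares LASSO by cyclic coordinate descent on synthetic data, and the data, the penalty value, and the names `soft_threshold` and `lasso_coordinate_descent` are all assumptions made for the example.

```python
import random

def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; this step produces exact zeros."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0

def lasso_coordinate_descent(X, y, penalty, n_iter=100):
    """Minimise (1/2n)*||y - Xw||^2 + penalty*||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Covariance of feature j with the residual once its own
            # current contribution has been added back.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_wj = soft_threshold(rho, penalty) / col_sq[j]
            delta = new_wj - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_wj
    return w

# Synthetic stand-in for a biomarker panel: 8 candidate features, of which
# only features 0 and 3 actually drive the outcome (hypothetical data).
random.seed(0)
n_samples, n_features = 400, 8
X = [[random.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
y = [2.0 * row[0] - 1.5 * row[3] + random.gauss(0.0, 0.5) for row in X]
y_mean = sum(y) / n_samples
y = [value - y_mean for value in y]  # centre y so no intercept is needed

weights = lasso_coordinate_descent(X, y, penalty=0.2)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
print("selected feature indices:", selected)
```

In practice the penalty would be tuned by cross-validation, and a binary cancer outcome would call for a penalised logistic or survival model rather than least squares; the sparsity mechanism is the same.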
Results: The final model comprised four key biomarkers along with age, sex, and smoking intensity, and achieved an AUROC of 0.767 (95% CI: 0.723-0.814) for five-year risk prediction. High-risk individuals (17.19% of the cohort) accounted for 50.42% of incident cancer cases, with a 15.19-fold increased risk compared to the low-risk group. During follow-up of 2,863 high-risk subjects, 9.64% were newly diagnosed with cancer or precancerous lesions. Notably, cancer detection in the high-risk group was 5.02 times higher than in the low-risk group and 1.74 times higher than in the intermediate-risk group. In particular, the incidence of esophageal cancers in the high-risk group was 16.84 times that of the low-risk group.
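The two headline quantities can be made concrete with a short Python sketch. The `auroc` function computes the metric as the probability that a randomly chosen case outscores a randomly chosen non-case, on toy scores invented for the example. The `case_concentration` function uses only the two percentages reported above. It cannot reproduce the 15.19-fold figure, which compares against the low-risk group, whose size the abstract does not give; it compares the high-risk group with everyone else instead. Both function names are hypothetical.

```python
def auroc(scores, labels):
    """Probability that a random case scores above a random non-case (ties count half)."""
    cases = [s for s, label in zip(scores, labels) if label == 1]
    controls = [s for s, label in zip(scores, labels) if label == 0]
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))

def case_concentration(cohort_share, case_share):
    """Return (lift over the cohort average, incidence ratio versus all other participants)."""
    lift = case_share / cohort_share
    rest_rate = (1.0 - case_share) / (1.0 - cohort_share)
    return lift, lift / rest_rate

# Toy example: five people, two of whom develop cancer (label 1).
toy_auc = auroc([0.9, 0.8, 0.4, 0.35, 0.1], [1, 0, 1, 0, 0])

# Reported figures: 17.19% of the cohort holds 50.42% of incident cases.
lift, ratio_vs_rest = case_concentration(0.1719, 0.5042)
print(f"toy AUROC = {toy_auc:.3f}, lift = {lift:.2f}, ratio vs rest = {ratio_vs_rest:.2f}")
```

Under these figures the high-risk group carries roughly 2.9 times the cohort-average incidence and roughly 4.9 times the incidence of all remaining participants combined.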
Conclusions: This is the first population-based prospective study in a large Chinese cohort that leverages multi-scale data, including biomarkers, for multi-cancer risk prediction. Our effective risk stratification model not only enhances early cancer detection but also lays the foundation for the targeted application of advanced screening methods, including but not limited to multi-omics technologies and endoscopy. These findings support precision prevention strategies and the optimal allocation of healthcare resources.
Journal: Biomarker Research (Biochemistry, Genetics and Molecular Biology: Molecular Medicine)
CiteScore: 15.80
Self-citation rate: 1.80%
Articles published: 80
Review time: 10 weeks
About the journal:
Biomarker Research, an open-access, peer-reviewed journal, covers all aspects of biomarker investigation. It seeks to publish original discoveries, novel concepts, commentaries, and reviews across various biomedical disciplines. The field of biomarker research has progressed significantly with the rise of personalized medicine and individual health. Biomarkers play a crucial role in drug discovery and development, as well as in disease diagnosis, treatment, prognosis, and prevention, particularly in the genome era.