Using clinical and genetic risk factors for risk prediction of 8 cancers in the UK Biobank.

IF 3.4 Q2 ONCOLOGY
Jiaqi Hu, Yixuan Ye, Geyu Zhou, Hongyu Zhao
Journal: JNCI Cancer Spectrum
Publication date: 2024-02-29
Publication type: Journal Article
DOI: 10.1093/jncics/pkae008 (https://doi.org/10.1093/jncics/pkae008)
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10919929/pdf/
Citation count: 0

Abstract

Background: Models with polygenic risk scores and clinical factors to predict risk of different cancers have been developed, but these models have been limited by the polygenic risk score-derivation methods and the incomplete selection of clinical variables.

Methods: We used UK Biobank to train the best polygenic risk scores for 8 cancers (bladder, breast, colorectal, kidney, lung, ovarian, pancreatic, and prostate cancers) and select relevant clinical variables from 733 baseline traits through extreme gradient boosting (XGBoost). Combining polygenic risk scores and clinical variables, we developed Cox proportional hazards models for risk prediction in these cancers.
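The modelling step described in the Methods can be illustrated with a minimal sketch of a Cox proportional hazards fit. This is not the authors' pipeline: it assumes a single covariate (a hypothetical combined risk score), no tied event times, and toy data invented for illustration. The XGBoost variable-selection step is not shown. The function name `cox_fit_1d` is an assumption of this sketch.

```python
import math

def cox_fit_1d(times, events, x, iters=25):
    """Estimate beta for a one-covariate Cox model by Newton-Raphson on the
    partial log-likelihood (assumes no tied event times)."""
    beta = 0.0
    n = len(times)
    for _ in range(iters):
        grad, hess = 0.0, 0.0
        for i in range(n):
            if not events[i]:
                continue  # censored subjects only contribute through risk sets
            # Risk set: everyone still under observation at times[i].
            risk = [j for j in range(n) if times[j] >= times[i]]
            w = [math.exp(beta * x[j]) for j in risk]
            s0 = sum(w)
            s1 = sum(wj * x[j] for wj, j in zip(w, risk))
            s2 = sum(wj * x[j] ** 2 for wj, j in zip(w, risk))
            grad += x[i] - s1 / s0                   # first derivative
            hess -= s2 / s0 - (s1 / s0) ** 2         # second derivative (<= 0)
        if hess == 0:
            break
        step = grad / hess
        beta -= step
        if abs(step) < 1e-10:
            break
    return beta

# Toy cohort (hypothetical): follow-up time, event indicator, combined risk score.
times = [2, 3, 4, 5, 6, 7, 8, 9]
events = [1, 1, 0, 1, 1, 0, 1, 0]
score = [1.2, 0.3, 0.8, 0.9, -0.5, 0.1, -0.2, -1.0]

beta = cox_fit_1d(times, events, score)
print("hazard ratio per unit of score:", math.exp(beta))
```

In practice a multi-covariate model with tie handling and confidence intervals would come from an established survival-analysis library rather than a hand-written optimizer.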

Results: Our models achieved high prediction accuracy for 8 cancers, with areas under the curve ranging from 0.618 (95% confidence interval = 0.581 to 0.655) for ovarian cancer to 0.831 (95% confidence interval = 0.817 to 0.845) for lung cancer. Additionally, our models could identify individuals at a high risk for developing cancer. For example, the risk of breast cancer for individuals in the top 5% score quantile was nearly 13 times greater than for individuals in the lowest 10%. Furthermore, we observed a higher proportion of individuals with high polygenic risk scores in the early-onset group but a higher proportion of individuals at high clinical risk in the late-onset group.
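The two kinds of figures reported in the Results, an area under the curve and a top-versus-bottom quantile risk comparison, can be sketched as follows. This is an illustration under simplifying assumptions: it uses binary case/control labels and crude incidence proportions, whereas the abstract does not say whether the AUC is time-dependent or whether the 13-fold figure is a hazard ratio. The function names and example values are hypothetical.

```python
def auc(scores, labels):
    """Probability that a random case outscores a random control
    (ties count as one half)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))

def quantile_relative_risk(scores, labels, top=0.05, bottom=0.10):
    """Ratio of case proportion in the top score quantile to that in the
    bottom quantile. Raises ZeroDivisionError if the bottom group has no cases."""
    ranked = sorted(zip(scores, labels))          # ascending by score
    n = len(ranked)
    n_top = max(1, round(n * top))
    n_bottom = max(1, round(n * bottom))
    high = ranked[-n_top:]
    low = ranked[:n_bottom]
    rate_high = sum(y for _, y in high) / len(high)
    rate_low = sum(y for _, y in low) / len(low)
    return rate_high / rate_low

# Four-person example: cases score 0.35 and 0.8, controls score 0.1 and 0.4.
print(auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # 0.75
```

An AUC of 0.5 corresponds to a score that ranks cases no better than chance, which is a useful baseline when reading the reported range of 0.618 to 0.831.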

Conclusion: Our models demonstrated the potential to predict cancer risk and identify high-risk individuals with great generalizability to different cancers. Our findings suggested that the polygenic risk score model is more predictive for the cancer risk of early-onset patients than for late-onset patients, while the clinical risk model is more predictive for late-onset patients. Meanwhile, combining polygenic risk scores and clinical risk factors has overall better predictive performance than using polygenic risk scores or clinical risk factors alone.

Source journal: JNCI Cancer Spectrum (Medicine-Oncology)
CiteScore: 7.70
Self-citation rate: 0.00%
Articles published: 80
Review time: 18 weeks