Federated deep learning enables cancer subtyping by proteomics.

IF 29.7 | Zone 1, Medicine | Q1 Oncology
Zhaoxiang Cai, Emma L Boys, Zainab Noor, Adel T Aref, Dylan Xavier, Natasha Lucas, Steven G Williams, Jennifer Ms Koh, Rebecca C Poulos, Yangxiu Wu, Michael Dausmann, Karen L MacKenzie, Adriana Aguilar-Mahecha, Carolina Armengol, Maria M Barranco, Mark Basik, Elise D Bowman, Roderick Clifton-Bligh, Elizabeth A Connolly, Wendy A Cooper, Bhavik Dalal, Anna DeFazio, Martin Filipits, Peter J Flynn, J Dinny Graham, Jacob George, Anthony J Gill, Michael Gnant, Rosemary Habib, Curtis C Harris, Kate Harvey, Lisa G Horvath, Christopher Jackson, Maija R J Kohonen-Corish, Elgene Lim, Jia Jenny Liu, Georgina V Long, Reginald V Lord, Graham J Mann, Geoffrey W McCaughan, Lucy Morgan, Leigh Murphy, Sumanth Nagabushan, Adnan Nagrial, Jordi Navinés, Benedict J Panizza, Jaswinder S Samra, Richard A Scolyer, John Souglakos, Alexander Swarbrick, David Thomas, Rosemary L Balleine, Peter G Hains, Phillip J Robinson, Qing Zhong, Roger R Reddel
Cancer Discovery. Journal Article. Published 2025-06-09. DOI: 10.1158/2159-8290.CD-24-1488 (https://doi.org/10.1158/2159-8290.CD-24-1488)
Citations: 0

Abstract

Artificial intelligence applications in biomedicine face major challenges from data privacy requirements. To address this issue for clinically annotated tissue proteomic data, we developed a Federated Deep Learning (FDL) approach (ProCanFDL), training local models on simulated sites containing data from a pan-cancer cohort (n=1,260) and 29 cohorts held behind private firewalls (n=6,265), representing 19,930 replicate data-independent acquisition mass spectrometry (DIA-MS) runs. Local parameter updates were aggregated to build the global model, achieving a 43% performance gain on the hold-out test set (n=625) in 14 cancer subtyping tasks compared to local models, and matching centralized model performance. The approach's generalizability was demonstrated by retraining the global model with data from two external DIA-MS cohorts (n=55) and eight acquired by tandem mass tag (TMT) proteomics (n=832). ProCanFDL presents a solution for internationally collaborative machine learning initiatives using proteomic data, e.g., for discovering predictive biomarkers or treatment targets, while maintaining data privacy.
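
The aggregation step described in the abstract, where local parameter updates are combined into a global model, can be illustrated with a minimal sketch. The abstract does not state which aggregation rule ProCanFDL uses, so the sample-size-weighted averaging below (in the style of federated averaging) is an assumption, and the function name `federated_average`, the parameter layout, and the toy numbers are hypothetical. Only parameters leave each site; the underlying proteomic data stay local.

```python
from typing import Dict, List, Sequence

# Hypothetical layout: each site's model is a mapping from a parameter
# name to a flat list of floats. This is an illustration only, not the
# ProCanFDL model format.
Params = Dict[str, List[float]]


def federated_average(site_params: Sequence[Params],
                      site_sizes: Sequence[int]) -> Params:
    """Combine local parameters into global ones.

    Assumption: each site is weighted by its number of training samples.
    The abstract does not specify the weighting actually used.
    """
    if not site_params or len(site_params) != len(site_sizes):
        raise ValueError("need one sample count per site")
    total = sum(site_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    global_params: Params = {}
    for name, reference in site_params[0].items():
        aggregated = [0.0] * len(reference)
        for params, n_samples in zip(site_params, site_sizes):
            weight = n_samples / total
            for i, value in enumerate(params[name]):
                aggregated[i] += weight * value
        global_params[name] = aggregated
    return global_params


if __name__ == "__main__":
    # Two toy sites with made-up parameters and sample counts.
    site_a = {"w": [1.0, 2.0], "b": [0.0]}
    site_b = {"w": [3.0, 4.0], "b": [1.0]}
    print(federated_average([site_a, site_b], [100, 300]))
```

In this toy run the second site holds three quarters of the samples, so the global parameters sit three quarters of the way from the first site's values to the second's.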

Source journal: Cancer Discovery (Oncology)
CiteScore: 22.90
Self-citation rate: 1.40%
Articles published: 838
Review time: 6-12 weeks
About the journal: Cancer Discovery publishes high-impact, peer-reviewed articles detailing significant advances in both research and clinical trials. Serving as a premier cancer information resource, the journal also features Review Articles, Perspectives, Commentaries, News stories, and Research Watch summaries to keep readers abreast of the latest findings in the field. Covering a wide range of topics, from laboratory research to clinical trials and epidemiologic studies, Cancer Discovery spans the entire spectrum of cancer research and medicine.