FedKD-CPI: Combining the federated knowledge distillation technique to accomplish synergistic compound-protein interaction prediction

Impact factor 4.2; Region 3, Biology; JCR Q1, Biochemical Research Methods
Xuetao Wang, Qichang Zhao, Jianxin Wang
Methods, volume 234, pages 275-283. Published 2025-02-01. Journal Article.
DOI: 10.1016/j.ymeth.2024.12.014
https://www.sciencedirect.com/science/article/pii/S1046202325000076

Abstract

Compound-protein interaction (CPI) prediction is critical in the early stages of drug discovery: it narrows the search space for candidate interactions and reduces the cost and time required by traditional high-throughput screening. However, CPI-related data are usually distributed across different institutions, and sharing them is restricted by data privacy and intellectual property concerns. A scheme that enables multi-institutional collaboration to improve prediction accuracy while protecting data privacy is therefore essential. To this end, we propose FedKD-CPI, the first framework based on federated knowledge distillation, to facilitate multi-party collaborative CPI prediction while ensuring data privacy and security. FedKD-CPI uses knowledge distillation to extract the updated knowledge of all client models and trains a model on the server to aggregate that knowledge, which makes effective use of the knowledge contained in both public and private data. We evaluate FedKD-CPI on three benchmark datasets and compare it with four baselines. The results show that FedKD-CPI performs very close to centralized learning and significantly better than localized learning. Furthermore, FedKD-CPI outperforms federated learning-based baselines on both independent and identically distributed (IID) data and non-IID data. Overall, FedKD-CPI improves CPI prediction while ensuring data security and promoting collaboration among institutions to accelerate drug discovery.
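The aggregation step described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a common federated distillation setup in which each client trains on its private data, shares only its predictions on a shared public set, and the server trains its own model on the averaged predictions. The logistic-regression models, the synthetic data, and all names (`train_logreg`, `distillation_round`, `make_data`) are hypothetical stand-ins for the CPI models and datasets used in the paper.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(w, X):
    # Predicted probability of the positive class for each row of X.
    return [sigmoid(sum(wi * xi for wi, xi in zip(w, x))) for x in X]

def train_logreg(X, y, w=None, lr=0.5, epochs=100):
    """Gradient descent on cross-entropy; y may hold hard or soft labels in [0, 1]."""
    w = list(w) if w is not None else [0.0] * len(X[0])
    for _ in range(epochs):
        errors = [p - t for p, t in zip(predict(w, X), y)]
        for j in range(len(w)):
            grad_j = sum(e * x[j] for e, x in zip(errors, X)) / len(X)
            w[j] -= lr * grad_j
    return w

def distillation_round(private_sets, X_public, server_w=None):
    # 1. Each client trains locally; private data never leaves the client.
    client_ws = [train_logreg(X, y) for X, y in private_sets]
    # 2. Clients share only their predictions on the public set (assumption).
    client_probs = [predict(w, X_public) for w in client_ws]
    # 3. The server averages client predictions into soft labels.
    soft_labels = [sum(col) / len(col) for col in zip(*client_probs)]
    # 4. The server model is trained (distilled) on the public set with soft labels.
    server_w = train_logreg(X_public, soft_labels, w=server_w)
    return server_w, soft_labels

# Synthetic stand-in data: three clients, one public set, one held-out test set.
rng = random.Random(0)
TRUE_W = [2.0, -1.0, 0.5]

def make_data(n):
    X = [[rng.gauss(0, 1) for _ in TRUE_W] for _ in range(n)]
    y = [1.0 if sum(w * xi for w, xi in zip(TRUE_W, x)) > 0 else 0.0 for x in X]
    return X, y

private_sets = [make_data(200) for _ in range(3)]
X_public, _ = make_data(300)  # public labels are deliberately unused
X_test, y_test = make_data(500)

server_w, soft_labels = distillation_round(private_sets, X_public)
hits = sum((p > 0.5) == (t == 1.0)
           for p, t in zip(predict(server_w, X_test), y_test))
accuracy = hits / len(y_test)
```

In this sketch the server never sees private features or labels, only per-sample probabilities on the public set, which is the privacy argument behind distillation-based aggregation. A real system would repeat the round several times and feed the server's knowledge back to the clients.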
Source journal: Methods (Biology: Biochemical Research Methods)
CiteScore: 9.80
Self-citation rate: 2.10%
Articles published: 222
Review time: 11.3 weeks
About the journal: Methods focuses on rapidly developing techniques in the experimental biological and medical sciences. Each topical issue, organized by a guest editor who is an expert in the area covered, consists solely of invited quality articles by specialist authors, many of them reviews. Issues are devoted to specific technical approaches, with emphasis on clear, detailed descriptions of protocols that allow them to be reproduced easily. The background information provided enables researchers to understand the principles underlying the methods; other helpful sections include comparisons of alternative methods giving the advantages and disadvantages of particular methods, guidance on avoiding potential pitfalls, and suggestions for troubleshooting.