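Phage Host Prediction Using Deep Neural Network with Multi-source Protein Language Models and Squeeze-and-Excitation Attention Mechanism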

IF 6.7 · CAS Zone 2 (Medicine) · JCR Q1 · COMPUTER SCIENCE, INFORMATION SYSTEMS
Peng Gao, Long Xu, Yuan Bai, Qiuzhen Lin, Junkai Ji, Lijia Ma
DOI: 10.1109/JBHI.2025.3582652 (https://doi.org/10.1109/JBHI.2025.3582652)
Journal: IEEE Journal of Biomedical and Health Informatics
Published: 2025-06-24
Publication type: Journal Article
Citations: 0

Abstract

Phage Host Prediction Using Deep Neural Network with Multi-source Protein Language Models and Squeeze-and-Excitation Attention Mechanism.

Phage therapy (PT) has become a promising alternative for treating infections as antimicrobial resistance increases. PT uses phages that bind to specific receptors on bacterial surfaces via receptor-binding proteins (RBPs), enabling precise destruction of targeted hosts. A key problem in PT is phage host prediction (PHP), which aims to match therapeutic phages to pathogenic hosts. However, traditional PHP methods rely on time-consuming and expensive wet-lab experiments, while recent computational methods neglect the evolutionary diversity and local feature patterns of RBPs. In this article, we propose a novel deep neural network, PHPRBP, for PHP based on phage RBPs. PHPRBP first uses pre-trained protein language models (ESM2 and ProtT5) to learn multi-source embedding representations of these RBPs, revealing diverse and complementary features. It then applies an adaptive synthetic technique to augment minority-class samples, addressing data scarcity. Next, a deep neural network architecture uses a convolutional neural network to capture local sequence features and a squeeze-and-excitation attention mechanism to enhance the contribution of important features. Finally, a fully connected network performs host prediction. Experimental results show that PHPRBP outperforms state-of-the-art methods in host prediction at both the genus and species levels. The data and code of PHPRBP are available at https://github.com/a1678019300/PHPRBP.
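
The squeeze-and-excitation step mentioned in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation (see the linked repository for that); it only shows the generic squeeze, excitation, and scale pattern on a feature map shaped [channels][length]. The channel count, sequence length, reduction ratio, and random weights below are all hypothetical stand-ins for values a trained CNN would supply.

```python
import math
import random


def relu(x):
    return x if x > 0.0 else 0.0


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(vec, weights, bias):
    """Fully connected layer; `weights` holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weights, bias)]


def squeeze_excite(feature_map, w1, b1, w2, b2):
    """Apply squeeze-and-excitation to a feature map shaped [channels][length]."""
    # Squeeze: global average pooling collapses each channel to one number.
    squeezed = [sum(channel) / len(channel) for channel in feature_map]
    # Excitation: bottleneck FC + ReLU, then FC + sigmoid yields one gate per channel.
    hidden = [relu(h) for h in dense(squeezed, w1, b1)]
    gates = [sigmoid(g) for g in dense(hidden, w2, b2)]
    # Scale: multiply every position of a channel by that channel's gate.
    scaled = [[gate * x for x in channel] for gate, channel in zip(gates, feature_map)]
    return scaled, gates


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
channels, length, reduction = 4, 6, 2  # hypothetical sizes, not from the paper
hidden_units = channels // reduction

# Stand-in for convolutional features computed from an RBP embedding.
feature_map = random_matrix(channels, length, rng)
w1 = random_matrix(hidden_units, channels, rng)
b1 = [0.0] * hidden_units
w2 = random_matrix(channels, hidden_units, rng)
b2 = [0.0] * channels

scaled, gates = squeeze_excite(feature_map, w1, b1, w2, b2)
```

Each gate lies strictly between 0 and 1, so channels the excitation network scores highly keep most of their magnitude while the others are damped. In a trained model the two dense layers would be learned jointly with the convolutional filters.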

Source journal
IEEE Journal of Biomedical and Health Informatics
Categories: COMPUTER SCIENCE, INFORMATION SYSTEMS; COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS
CiteScore: 13.60
Self-citation rate: 6.50%
Articles published: 1151
Journal description: IEEE Journal of Biomedical and Health Informatics publishes original papers presenting recent advances where information and communication technologies intersect with health, healthcare, life sciences, and biomedicine. Topics include acquisition, transmission, storage, retrieval, management, and analysis of biomedical and health information. The journal covers applications of information technologies in healthcare, patient monitoring, preventive care, early disease diagnosis, therapy discovery, and personalized treatment protocols. It explores electronic medical and health records, clinical information systems, decision support systems, medical and biological imaging informatics, wearable systems, body area/sensor networks, and more. Integration-related topics like interoperability, evidence-based medicine, and secure patient data are also addressed.