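SpatConv Enables the Accurate Prediction of Protein Binding Sites by a Pretrained Protein Language Model and an Interpretable Bio-spatial Convolution.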

IF 11 · Zone 1 (Comprehensive journals) · JCR Q1 (Multidisciplinary)
Research · Pub Date: 2025-07-08 · eCollection Date: 2025-01-01 · DOI: 10.34133/research.0773
Mingming Guan, Jiyun Han, Shizhuo Zhang, Hongyu Zheng, Juntao Liu
Research, vol. 8, article 0773 (Journal Article). Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12237623/pdf/
Citations: 0

Abstract


Protein interactions with molecules, such as other proteins, peptides, or small ligands, play a critical role in biological processes, and the identification of protein binding sites is crucial for understanding the mechanisms underlying diseases such as cancer. Traditional protein binding site prediction models usually extract residue features manually and then employ a graph-based or point-cloud-based architecture borrowed from other fields. As a result, substantial information loss and limited learning ability cause them to fail to capture residue binding patterns. To address these challenges, we introduce SpatConv, a general network that predicts the binding residues of proteins, peptides, and metal ions on proteins. SpatConv extracts sequence features from a pretrained large protein language model and structure features from a local coordinate framework. It learns residue binding patterns through a specially designed, graph-free bio-spatial convolution, which characterizes the complex spatial environments around the residues. After training and testing, SpatConv demonstrates substantial improvements over state-of-the-art predictors and reveals novel biological insights into the relationship between binding sites and physicochemical properties. Notably, SpatConv exhibits robust performance across predicted and experimental structures, enhancing its reliability. Additionally, when applied to the spike protein structure of severe acute respiratory syndrome coronavirus 2, SpatConv successfully identifies antibody binding sites and predicts potential binding regions, providing strong evidence supporting new drug development. A user-friendly online server for SpatConv is freely available at http://liulab.top/SpatConv/server.
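
The abstract does not say how the local coordinate framework is constructed, so the following is only an illustrative sketch of one common construction, not the authors' implementation. It assumes a residue-local frame built from the backbone N, CA, and C atoms by Gram-Schmidt orthogonalization, and then expresses neighboring positions in that frame. The function names, the choice of atoms, and the `radius` parameter are all assumptions introduced here for illustration.

```python
import numpy as np


def local_frame(n, ca, c):
    """Build an orthonormal residue-local frame (rows are the axes).

    Assumption: the frame is defined by backbone atoms N, CA, C.
    This is a common convention, not something the abstract specifies.
    """
    e1 = c - ca
    e1 = e1 / np.linalg.norm(e1)          # first axis: CA -> C
    u = n - ca
    u = u - np.dot(u, e1) * e1            # remove the component along e1
    e2 = u / np.linalg.norm(u)            # second axis: in the N-CA-C plane
    e3 = np.cross(e1, e2)                 # third axis: right-handed normal
    return np.stack([e1, e2, e3])


def neighbors_in_local_frame(center_ca, frame, coords, radius):
    """Return coordinates of points within `radius` of the center,
    expressed in the center residue's local frame."""
    offsets = coords - center_ca
    dist = np.linalg.norm(offsets, axis=1)
    mask = (dist > 0) & (dist <= radius)  # drop the center itself
    return offsets[mask] @ frame.T        # project onto the local axes
```

Coordinates expressed this way do not change when the whole structure is rotated or translated, which is the usual motivation for describing a residue's spatial environment in a local frame rather than in global coordinates.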

Source journal: Research (Multidisciplinary)
CiteScore: 13.40
Self-citation rate: 3.60%
Articles published: 0
Review time: 14 weeks
About the journal: Research serves as a global platform for academic exchange, collaboration, and technological advancement. The journal welcomes high-quality research contributions from any domain and from authors around the globe. Comprising fundamental research in the life and physical sciences, Research also highlights significant findings and issues in engineering and applied science. It features original research articles, reviews, perspectives, and editorials, fostering a diverse and dynamic scholarly environment.