FoldExplorer: Fast and Accurate Protein Structure Search with Sequence-Enhanced Graph Embedding

IF 4.5 | Zone 2 (Biology) | JCR Q1 (Biochemistry & Molecular Biology)
Yuan Liu, Ying Zhang, Zhen Zhou, Hong-Bin Shen
Journal of Molecular Biology, volume 437, issue 21, Article 169412. Published 2025-08-30.
DOI: 10.1016/j.jmb.2025.169412
Source: https://www.sciencedirect.com/science/article/pii/S0022283625004784
Citations: 0

Abstract

The advent of highly accurate protein structure prediction methods has fueled an exponential expansion of protein structure databases. Consequently, demand for rapid and precise protein structure search is rising. Traditional alignment-based methods are designed for precise pairwise comparisons and offer high accuracy, but they face challenges when searching large databases. In response to this challenge, we propose FoldExplorer, a novel deep-learning approach. It leverages graph attention neural networks and protein language models to jointly encode structural and sequence information, generating embeddings tailored for protein structure search. FoldExplorer achieves competitive performance in geometric similarity search and classification tasks, outperforming recent deep-learning and sequence-based methods and approaching classical alignment tools. Notably, FoldExplorer remains effective when searching low-confidence predicted structures, and it is particularly efficient when searching large-scale databases. The accurate embedding space generated by FoldExplorer provides a comprehensive view of protein structure space, which will offer new insights into clusters and boundaries in studies of the protein universe. A publicly accessible search web server is available at: http://www.csbio.sjtu.edu.cn/bioinf/FoldExplorer/.
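To make the idea of embedding-based search concrete, here is a minimal sketch of how a query embedding could be ranked against a database of precomputed embeddings. It is an illustration only, not FoldExplorer's implementation: the abstract does not state which similarity measure is used, so cosine similarity is an assumption, and the function names, protein names, and three-dimensional toy vectors are all hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_search(query, database, top_k=3):
    """Return the top_k (name, score) pairs ranked by cosine similarity.

    `database` maps a structure name to its precomputed embedding.
    Cosine similarity is an assumed metric, chosen for illustration.
    """
    q = l2_normalize(query)
    scored = []
    for name, emb in database.items():
        e = l2_normalize(emb)
        score = sum(a * b for a, b in zip(q, e))
        scored.append((name, score))
    # Highest similarity first.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy embeddings; real embeddings would come from a trained encoder.
toy_db = {
    "protein_A": [0.9, 0.1, 0.0],
    "protein_B": [0.0, 1.0, 0.0],
    "protein_C": [0.7, 0.7, 0.1],
}
hits = cosine_search([1.0, 0.0, 0.0], toy_db, top_k=2)
for name, score in hits:
    print(f"{name}\t{score:.3f}")
```

The point of the sketch is the cost profile: each database entry is compared with a single dot product over a fixed-length vector, whereas alignment-based search must run a pairwise structural alignment per entry. In practice the linear scan would likely be replaced by a vectorized or approximate nearest-neighbor index.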
Source journal
Journal of Molecular Biology (Biology: Biochemistry and Molecular Biology)
CiteScore: 11.30
Self-citation rate: 1.80%
Articles published: 412
Review time: 28 days
About the journal: Journal of Molecular Biology (JMB) provides high quality, comprehensive and broad coverage in all areas of molecular biology. The journal publishes original scientific research papers that provide mechanistic and functional insights and report a significant advance to the field. The journal encourages the submission of multidisciplinary studies that use complementary experimental and computational approaches to address challenging biological questions. Research areas include but are not limited to: Biomolecular interactions, signaling networks, systems biology; Cell cycle, cell growth, cell differentiation; Cell death, autophagy; Cell signaling and regulation; Chemical biology; Computational biology, in combination with experimental studies; DNA replication, repair, and recombination; Development, regenerative biology, mechanistic and functional studies of stem cells; Epigenetics, chromatin structure and function; Gene expression; Membrane processes, cell surface proteins and cell-cell interactions; Methodological advances, both experimental and theoretical, including databases; Microbiology, virology, and interactions with the host or environment; Microbiota mechanistic and functional studies; Nuclear organization; Post-translational modifications, proteomics; Processing and function of biologically important macromolecules and complexes; Molecular basis of disease; RNA processing, structure and functions of non-coding RNAs, transcription; Sorting, spatiotemporal organization, trafficking; Structural biology; Synthetic biology; Translation, protein folding, chaperones, protein degradation and quality control.