FoldExplorer: Fast and Accurate Protein Structure Search with Sequence-Enhanced Graph Embedding
Authors: Yuan Liu, Ying Zhang, Zhen Zhou, Hong-Bin Shen
Journal: Journal of Molecular Biology 437 (21), Article 169412
DOI: 10.1016/j.jmb.2025.169412
Published: 2025-08-30
Publication type: Journal Article
Impact factor: 4.5; JCR: Q1 (Biochemistry & Molecular Biology)
Article page: https://www.sciencedirect.com/science/article/pii/S0022283625004784
The advent of highly accurate protein structure prediction methods has fueled an exponential expansion of protein structure databases. Consequently, demand for rapid and precise protein structure search is rising. Traditional alignment-based methods are designed for precise pairwise comparison and offer high accuracy, but they face challenges when searching large databases. In response, we propose FoldExplorer, a novel deep-learning approach. It leverages graph attention networks and protein language models to jointly encode structural and sequence information, generating embeddings tailored for protein structure search. FoldExplorer achieves competitive performance in geometric similarity search and classification tasks, outperforming recent deep-learning and sequence-based methods and approaching classical alignment tools. Notably, FoldExplorer remains effective when searching low-confidence predicted structures, and it is highly efficient when searching large-scale databases. The accurate embedding space generated by FoldExplorer provides a comprehensive view of protein structure space, which will offer new insights into clusters and boundaries in studies of the protein universe. A publicly accessible search web server is available at: http://www.csbio.sjtu.edu.cn/bioinf/FoldExplorer/.
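The abstract describes search over fixed-length embeddings rather than pairwise alignment. The sketch below illustrates that general idea only; it is not FoldExplorer's implementation. The function names `normalize` and `search`, the 128-dimensional vector size, and the random vectors standing in for real structure embeddings are all assumptions made for illustration, and cosine similarity is an assumed ranking measure that the abstract does not specify.

```python
import numpy as np


def normalize(vectors: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so a dot product equals cosine similarity."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / np.clip(norms, 1e-12, None)


def search(query: np.ndarray, database: np.ndarray, top_k: int = 3):
    """Return (indices, scores) of the top_k database rows most similar to query."""
    db = normalize(database)
    q = normalize(query.reshape(1, -1))[0]
    scores = db @ q                      # one matrix-vector product scores every entry
    top_k = min(top_k, len(scores))
    order = np.argsort(-scores)[:top_k]  # highest cosine similarity first
    return order, scores[order]


# Hypothetical stand-in data: random vectors in place of real structure embeddings.
rng = np.random.default_rng(0)
database = rng.normal(size=(1000, 128))
# A slightly perturbed copy of entry 42 should retrieve entry 42 as the best hit.
query = database[42] + 0.01 * rng.normal(size=128)
indices, scores = search(query, database)
```

Because the database embeddings can be computed once and stored, each query reduces to a single matrix-vector product, which is one plausible reason embedding-based search scales better than running a pairwise alignment against every entry.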
About the journal:
Journal of Molecular Biology (JMB) provides high quality, comprehensive and broad coverage in all areas of molecular biology. The journal publishes original scientific research papers that provide mechanistic and functional insights and report a significant advance to the field. The journal encourages the submission of multidisciplinary studies that use complementary experimental and computational approaches to address challenging biological questions.
Research areas include but are not limited to: Biomolecular interactions, signaling networks, systems biology; Cell cycle, cell growth, cell differentiation; Cell death, autophagy; Cell signaling and regulation; Chemical biology; Computational biology, in combination with experimental studies; DNA replication, repair, and recombination; Development, regenerative biology, mechanistic and functional studies of stem cells; Epigenetics, chromatin structure and function; Gene expression; Membrane processes, cell surface proteins and cell-cell interactions; Methodological advances, both experimental and theoretical, including databases; Microbiology, virology, and interactions with the host or environment; Microbiota mechanistic and functional studies; Nuclear organization; Post-translational modifications, proteomics; Processing and function of biologically important macromolecules and complexes; Molecular basis of disease; RNA processing, structure and functions of non-coding RNAs, transcription; Sorting, spatiotemporal organization, trafficking; Structural biology; Synthetic biology; Translation, protein folding, chaperones, protein degradation and quality control.