Large-scale protein clustering in the age of deep learning

IF 6.1 | Zone 2, Biology | JCR Q1, Biochemistry & Molecular Biology
Joana Pereira, Lorenzo Pantolini, Janani Durairaj, Torsten Schwede
Current Opinion in Structural Biology, volume 94, Article 103078. Published 2025-06-13. DOI: 10.1016/j.sbi.2025.103078
https://www.sciencedirect.com/science/article/pii/S0959440X2500096X
Citations: 0

Abstract

Proteins within a family sharing sequence and structure similarity due to a common evolutionary origin often also share functional similarities. Clustering of proteins therefore offers valuable insights, enabling the transfer of features and annotations from well-studied proteins to less-investigated ones. On a local scale, clustering helps identify patterns within specific protein families. On a larger scale, it provides insights into the entire protein universe, showcasing relationships that may not be immediately apparent. Traditionally, this was done at the sequence level or with the use of experimentally resolved protein structures, but the advent of deep learning in protein bioinformatics has brought new options to the table, increasing the breadth, depth, and diversity of similarity metrics and clustering approaches.
Source journal
Current Opinion in Structural Biology (Biology: Biochemistry & Molecular Biology)
CiteScore: 12.20
Self-citation rate: 2.90%
Articles published: 179
Review time: 6-12 weeks
About the journal: Current Opinion in Structural Biology (COSB) aims to stimulate scientifically grounded, interdisciplinary, multi-scale debate and exchange of ideas. It contains polished, concise and timely reviews and opinions, with particular emphasis on those articles published in the past two years. In addition to describing recent trends, the authors are encouraged to give their subjective opinion of the topics discussed. In COSB, we help the reader by providing in a systematic manner: 1. The views of experts on current advances in their field in a clear and readable form. 2. Evaluations of the most interesting papers, annotated by experts, from the great wealth of original publications. [...] The subject of Structural Biology is divided into twelve themed sections, each of which is reviewed once a year. Each issue contains two sections, and the amount of space devoted to each section is related to its importance.
- Folding and Binding
- Nucleic acids and their protein complexes
- Macromolecular Machines
- Theory and Simulation
- Sequences and Topology
- New constructs and expression of proteins
- Membranes
- Engineering and Design
- Carbohydrate-protein interactions and glycosylation
- Biophysical and molecular biological methods
- Multi-protein assemblies in signalling
- Catalysis and Regulation