RCSB Protein Data Bank: visualizing groups of experimentally determined PDB structures alongside computed structure models of proteins

Impact factor: 2.8 · JCR Q2, Mathematical & Computational Biology
J. Segura, Yana Rose, Chunxiao Bi, Jose M. Duarte, Stephen K. Burley, S. Bittrich
Frontiers in Bioinformatics. Published 2023-12-04. DOI: 10.3389/fbinf.2023.1311287 (https://doi.org/10.3389/fbinf.2023.1311287)

Abstract

Recent advances in Artificial Intelligence and Machine Learning (e.g., AlphaFold, RoseTTAFold, and ESMFold) enable prediction of three-dimensional (3D) protein structures from amino acid sequences alone at accuracies comparable to lower-resolution experimental methods. These tools have been employed to predict structures across entire proteomes and the results of large-scale metagenomic sequence studies, yielding an exponential increase in available biomolecular 3D structural information. Given the enormous volume of this newly computed biostructure data, there is an urgent need for robust tools to manage, search, cluster, and visualize large collections of structures. Equally important is the capability to efficiently summarize and visualize metadata, biological/biochemical annotations, and structural features, particularly when working with vast numbers of protein structures, both experimentally determined structures from the Protein Data Bank (PDB) and computationally predicted models. Moreover, researchers require advanced visualization techniques that support interactive exploration of multiple sequences and structural alignments. This paper introduces a suite of tools provided on the RCSB PDB research-focused web portal RCSB.org, tailor-made for efficient management, search, organization, and visualization of this burgeoning corpus of 3D macromolecular structure data.