Comprehensive encoding of conformational and compositional protein structural ensembles through the mmCIF data structure

IF 2.9 | Zone 2, Materials Science | Q2 Chemistry, Multidisciplinary
IUCrJ. Pub Date: 2024-07-01. DOI: 10.1107/S2052252524005098
Stephanie A. Wankowicz, James S. Fraser, Z.-J. Liu (Editor)
IUCrJ, volume 11, issue 4, pages 494-501. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11220883/pdf/

Abstract

Traditional structural models of biomolecules typically represent only a single conformational state, even though biomolecules naturally exist in multiple states crucial for their function. Here, we propose enhancements to the macromolecular crystallographic information file (mmCIF) to better capture the complex conformational and compositional heterogeneity of biomolecules in a way that is both human- and machine-interpretable.

In the folded state, biomolecules exchange between multiple conformational states crucial for their function. However, most structural models derived from experiments and computational predictions encode only a single state. To represent biomolecules accurately, we must move towards modeling and predicting structural ensembles. Information about structural ensembles exists within experimental data from X-ray crystallography and cryo-electron microscopy. Although new tools are available to detect conformational and compositional heterogeneity within these ensembles, the legacy PDB data structure does not robustly encapsulate this complexity. We propose modifications to the macromolecular crystallographic information file (mmCIF) to improve the representation and interrelation of conformational and compositional heterogeneity. These modifications will enable the capture of macromolecular ensembles in a human- and machine-interpretable way, potentially catalyzing breakthroughs for ensemble–function predictions, analogous to the achievements of AlphaFold with single-structure prediction.
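
For context, the existing mmCIF dictionary already records local conformational heterogeneity atom by atom, through the `_atom_site.label_alt_id` and `_atom_site.occupancy` items. The sketch below does not implement the modifications proposed in the paper, which the abstract does not spell out. It only illustrates the current per-atom encoding on a hypothetical, heavily trimmed `atom_site` loop. The sample record and the function names `parse_atom_site` and `altloc_summary` are assumptions made for illustration, and the parser handles only unquoted, single-line values.

```python
from collections import defaultdict

# Hypothetical, trimmed atom_site loop: one serine whose side chain is
# modeled in two alternate conformations, labeled A and B.
MMCIF_SNIPPET = """\
loop_
_atom_site.group_PDB
_atom_site.id
_atom_site.label_atom_id
_atom_site.label_alt_id
_atom_site.label_comp_id
_atom_site.label_asym_id
_atom_site.label_seq_id
_atom_site.occupancy
ATOM 1 N  . SER A 10 1.00
ATOM 2 CA . SER A 10 1.00
ATOM 3 CB A SER A 10 0.60
ATOM 4 OG A SER A 10 0.60
ATOM 5 CB B SER A 10 0.40
ATOM 6 OG B SER A 10 0.40
"""


def parse_atom_site(text):
    """Parse a simple whitespace-delimited atom_site loop into row dicts.

    Illustration only: a real mmCIF file needs a full CIF parser.
    """
    columns, rows = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line or line == "loop_" or line.startswith("#"):
            continue
        if line.startswith("_atom_site."):
            # Column header: keep the item name after the category prefix.
            columns.append(line.split(".", 1)[1])
        else:
            rows.append(dict(zip(columns, line.split())))
    return rows


def altloc_summary(rows):
    """Collect, per residue, the atoms and occupancy of each alternate conformer."""
    summary = defaultdict(dict)
    for row in rows:
        alt = row["label_alt_id"]
        if alt == ".":  # "." marks an atom with no alternate conformation
            continue
        residue = (row["label_asym_id"], row["label_comp_id"], int(row["label_seq_id"]))
        entry = summary[residue].setdefault(
            alt, {"atoms": [], "occupancy": float(row["occupancy"])}
        )
        entry["atoms"].append(row["label_atom_id"])
    return dict(summary)


if __name__ == "__main__":
    for residue, conformers in altloc_summary(parse_atom_site(MMCIF_SNIPPET)).items():
        for alt, info in sorted(conformers.items()):
            print(residue, alt, info["atoms"], info["occupancy"])
```

In this representation each alternate conformer is just a label on individual atoms, so nothing in the loop states how conformers on different residues, or bound ligands, relate to one another. That missing interrelation is the kind of gap the abstract refers to.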

Source journal: IUCrJ
Categories: Chemistry, Multidisciplinary; Crystallography
CiteScore: 7.50
Self-citation rate: 5.10%
Articles published: 95
Review time: 10 weeks
About the journal: IUCrJ is a new fully open-access peer-reviewed journal from the International Union of Crystallography (IUCr). The journal will publish high-profile articles on all aspects of the sciences and technologies supported by the IUCr via its commissions, including emerging fields where structural results underpin the science reported in the article. Our aim is to make IUCrJ the natural home for high-quality structural science results. Chemists, biologists, physicists and material scientists will be actively encouraged to report their structural studies in IUCrJ.