Perspective: on the importance of extensive, high-quality and reliable deposition of biomolecular NMR data in the age of artificial intelligence

IF 1.3, Zone 3 (Biology), JCR Q3 (Biochemistry & Molecular Biology)
Victoria A. Higman, Eliza Płoskoń, Gary S. Thompson, Geerten W. Vuister
Journal of Biomolecular NMR 78(4), 193-197. Published 2024-10-20.
DOI: 10.1007/s10858-024-00451-w
https://link.springer.com/article/10.1007/s10858-024-00451-w

Abstract

Artificial intelligence (AI) models are revolutionising scientific data analysis but are reliant on large training data sets. While artificial training data can be used in the context of NMR processing and data analysis methods, relating NMR parameters back to protein sequence and structure requires experimental data. In this perspective we examine what the biological NMR community needs to do to store and share its data better, so that we can make effective use of AI methods to further our understanding of biological molecules. We argue, first, that the community should be depositing much more of its experimental data; in particular, we should be depositing more spectra and dynamics data. Second, the NMR data deposited needs to capture the full information content required to use and validate it adequately. The NMR Exchange Format (NEF) was designed several years ago to do this. The widespread adoption of NEF, combined with a new proposal for dynamics data specifications, comes at the right time for the community to expand its deposition of data. Third, we highlight the importance of expanding and safeguarding our experimental data repository, the Biological Magnetic Resonance Data Bank (BMRB), in the interests not only of NMR spectroscopists but of biological scientists more widely. With this article we invite others in the biological NMR community to champion increased (possibly mandatory) data deposition, to get involved in designing new NEF specifications, and to advocate on behalf of the BMRB within the wider scientific community.

Source journal: Journal of Biomolecular NMR (Biology, Spectroscopy)
CiteScore: 6.00
Self-citation rate: 3.70%
Publication volume: 19
Review time: 6-12 weeks
About the journal: The Journal of Biomolecular NMR provides a forum for publishing research on technical developments and innovative applications of nuclear magnetic resonance spectroscopy for the study of structure and dynamic properties of biopolymers in solution, liquid crystals, solids and mixed environments, e.g., attached to membranes. This may include: Three-dimensional structure determination of biological macromolecules (polypeptides/proteins, DNA, RNA, oligosaccharides) by NMR. New NMR techniques for studies of biological macromolecules. Novel approaches to computer-aided automated analysis of multidimensional NMR spectra. Computational methods for the structural interpretation of NMR data, including structure refinement. Comparisons of structures determined by NMR with those obtained by other methods, e.g. by diffraction techniques with protein single crystals. New techniques of sample preparation for NMR experiments (biosynthetic and chemical methods for isotope labeling, preparation of nutrients for biosynthetic isotope labeling, etc.). An NMR characterization of the products must be included.