Accurate and Efficient Structure Elucidation from Routine One-Dimensional NMR Spectra Using Multitask Machine Learning

IF 12.7 | Zone 1, Chemistry | JCR Q1 (Chemistry, Multidisciplinary)
Frank Hu, Michael S. Chen, Grant M. Rotskoff*, Matthew W. Kanan* and Thomas E. Markland*
ACS Central Science, 2024, 10 (11), 2162–2170
Published: 2024-11-13
DOI: 10.1021/acscentsci.4c01132
URL: https://pubs.acs.org/doi/10.1021/acscentsci.4c01132
PDF: https://pubs.acs.org/doi/epdf/10.1021/acscentsci.4c01132
Citations: 0

Abstract

Rapid determination of molecular structures can greatly accelerate workflows across many chemical disciplines. However, elucidating structure using only one-dimensional (1D) NMR spectra, the most readily accessible data, remains an extremely challenging problem because of the combinatorial explosion of the number of possible molecules as the number of constituent atoms is increased. Here, we introduce a multitask machine learning framework that predicts the molecular structure (formula and connectivity) of an unknown compound solely based on its 1D ¹H and/or ¹³C NMR spectra. First, we show how a transformer architecture can be constructed to efficiently solve the task, traditionally performed by chemists, of assembling large numbers of molecular fragments into molecular structures. Integrating this capability with a convolutional neural network, we build an end-to-end model for predicting structure from spectra that is fast and accurate. We demonstrate the effectiveness of this framework on molecules with up to 19 heavy (non-hydrogen) atoms, a size for which there are trillions of possible structures. Without relying on any prior chemical knowledge such as the molecular formula, we show that our approach predicts the exact molecule 69.6% of the time within the first 15 predictions, reducing the search space by up to 11 orders of magnitude.

We introduce a multitask machine learning framework that rapidly predicts both the molecular structure and molecular fragments of an unknown compound using only one-dimensional ¹H and ¹³C NMR spectra.
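
The headline numbers in the abstract (an exact match within the first 15 predictions, and a search-space reduction of up to 11 orders of magnitude) can be read as a top-k accuracy and a simple logarithm. The following is a minimal sketch of that arithmetic, not the authors' evaluation code. It assumes every structure is already represented as a canonical string (for example canonical SMILES) so that string equality means "same molecule". The function names, the toy molecules, and the value 10^12 standing in for "trillions" are illustrative assumptions that do not come from the paper.

```python
import math
from typing import Sequence


def top_k_hit(ranked: Sequence[str], target: str, k: int = 15) -> bool:
    """Return True if the target appears among the first k ranked candidates.

    Assumes `ranked` and `target` are canonicalized strings (an assumption
    of this sketch), so plain string equality identifies the same molecule.
    """
    return target in ranked[:k]


def top_k_accuracy(all_ranked: Sequence[Sequence[str]],
                   targets: Sequence[str],
                   k: int = 15) -> float:
    """Fraction of test molecules whose true structure is in the top k."""
    if not targets or len(all_ranked) != len(targets):
        raise ValueError("need one non-empty ranked list per target")
    hits = sum(top_k_hit(r, t, k) for r, t in zip(all_ranked, targets))
    return hits / len(targets)


def orders_of_magnitude_reduction(n_possible: float, k: int = 15) -> float:
    """log10 of how much a k-candidate shortlist shrinks the search space."""
    return math.log10(n_possible / k)


# Hypothetical toy data: three "unknowns", each with a ranked candidate list.
predictions = [
    ["CCO", "COC"],          # true structure ranked first
    ["CC(=O)O", "OCC=O"],    # true structure ranked second
    ["c1ccccc1"],            # true structure missing from the list
]
truth = ["CCO", "OCC=O", "C1CCCCC1"]

print(top_k_accuracy(predictions, truth, k=1))
print(top_k_accuracy(predictions, truth, k=2))
print(round(orders_of_magnitude_reduction(1e12, k=15), 2))
```

Under the assumed count of 10^12 candidate structures, narrowing to 15 candidates corresponds to roughly 10.8 orders of magnitude, which is consistent with the abstract's "up to 11" when the true count is several trillion.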

Source journal: ACS Central Science
Subject category: Chemical Engineering - General Chemical Engineering
CiteScore: 25.50
Self-citation rate: 0.50%
Articles published: 194
Review time: 10 weeks
About the journal: ACS Central Science publishes significant primary reports on research in chemistry and allied fields where chemical approaches are pivotal. As the first fully open-access journal by the American Chemical Society, it covers compelling and important contributions to the broad chemistry and scientific community. "Central science," a term popularized nearly 40 years ago, emphasizes chemistry's central role in connecting physical and life sciences, and fundamental sciences with applied disciplines like medicine and engineering. The journal focuses on exceptional quality articles, addressing advances in fundamental chemistry and interdisciplinary research.