Enhancing Chemical Reaction Monitoring with a Deep Learning Model for NMR Spectra Image Matching to Target Compounds

IF 5.6 | Zone 2, Chemistry | JCR Q1, CHEMISTRY, MEDICINAL
ZiJing Tian, Yan Dai, Feng Hu, ZiHao Shen, HongLing Xu, HongWen Zhang, JinHang Xu, YuTing Hu, YanYan Diao, HongLin Li
Journal: Journal of Chemical Information and Modeling
DOI: 10.1021/acs.jcim.4c00522 (https://doi.org/10.1021/acs.jcim.4c00522)
Publication date: 2024-07-22 (Epub 2024-07-09)
Publication type: Journal Article
Citations: 0

Abstract


In the synthetic laboratory, researchers typically rely on nuclear magnetic resonance (NMR) spectra to elucidate the structures of synthesized products and confirm whether they match the desired target compounds. As chemical synthesis technology evolves toward intelligent, continuous operation, efficient computer-assisted structure elucidation (CASE) techniques are required to replace time-consuming manual analysis and provide the necessary speed. However, current CASE methods typically aim to derive precise chemical structures from spectroscopic data, and they suffer from drawbacks such as low accuracy, high computational cost, and reliance on chemical libraries. In carefully designed synthesis reactions, researchers care more about confirming from the NMR spectra that the target product was obtained than about identifying exactly which product was obtained. For this purpose, we developed a binary classification model, termed MatCS, to directly predict the relationship between NMR spectra images (including ¹H NMR and ¹³C NMR) and the molecular structure of the target compound. After evaluating various feature extraction methods, MatCS employs a combination of Graph Attention Networks and Graph Convolutional Networks to learn the structural features of molecular graphs, and a pretrained ResNet101 network with a Convolutional Block Attention Module to extract features from NMR spectra images. On a challenging Test_sim data set, in which spectra of similar molecular structures are hard to distinguish, MatCS achieves an F1-score of 0.81 and an AUC of 0.87. It also performed well on an external SDBS data set containing experimental NMR spectra, showing substantial potential for structural verification tasks in real automated chemical synthesis.
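
The abstract reports an F1-score and an AUC for a binary match / no-match classifier. As a rough illustration of how those two metrics are defined, here is a minimal sketch in plain Python. The function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for this example; they are not taken from the paper and say nothing about how MatCS itself was evaluated.

```python
from typing import Sequence


def f1_score(labels: Sequence[int], preds: Sequence[int]) -> float:
    """F1 = 2 * precision * recall / (precision + recall) for the positive (match) class."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive pair outscores a random negative pair.

    Ties count as half a win. This is the pairwise (Mann-Whitney) formulation,
    quadratic in the number of pairs, which is fine for a small illustration.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = spectrum image matches the target structure, 0 = it does not.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1]  # made-up classifier outputs, not results from the paper
preds = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 decision threshold

print(f"F1 = {f1_score(labels, preds):.3f}, AUC = {roc_auc(labels, scores):.3f}")
```

F1 depends on the chosen threshold, while AUC depends only on how the scores rank matching pairs against non-matching ones, which is one reason the two are commonly reported together.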

Source journal: Journal of Chemical Information and Modeling
CiteScore: 9.80
Self-citation rate: 10.70%
Articles published: 529
Review time: 1.4 months
Journal description: The Journal of Chemical Information and Modeling publishes papers reporting new methodology and/or important applications in the fields of chemical informatics and molecular modeling. Specific topics include the representation and computer-based searching of chemical databases, molecular modeling, computer-aided molecular design of new materials, catalysts, or ligands, development of new computational methods or efficient algorithms for chemical software, and biopharmaceutical chemistry including analyses of biological activity and other issues related to drug discovery. Astute chemists, computer scientists, and information specialists look to this monthly’s insightful research studies, programming innovations, and software reviews to keep current with advances in this integral, multidisciplinary field. As a subscriber you’ll stay abreast of database search systems, use of graph theory in chemical problems, substructure search systems, pattern recognition and clustering, analysis of chemical and physical data, molecular modeling, graphics and natural language interfaces, bibliometric and citation analysis, and synthesis design and reactions databases.