TransPeakNet for solvent-aware 2D NMR prediction via multi-task pre-training and unsupervised learning.

IF 5.9 2区 化学 Q1 CHEMISTRY, MULTIDISCIPLINARY
Yunrui Li, Hao Xu, Ambrish Kumar, Duo-Sheng Wang, Christian Heiss, Parastoo Azadi, Pengyu Hong
{"title":"TransPeakNet for solvent-aware 2D NMR prediction via multi-task pre-training and unsupervised learning.","authors":"Yunrui Li, Hao Xu, Ambrish Kumar, Duo-Sheng Wang, Christian Heiss, Parastoo Azadi, Pengyu Hong","doi":"10.1038/s42004-025-01455-9","DOIUrl":null,"url":null,"abstract":"<p><p>Nuclear Magnetic Resonance (NMR) spectroscopy is essential for revealing molecular structure, electronic environment, and dynamics. Accurate NMR shift prediction allows researchers to validate structures by comparing predicted and observed shifts. While Machine Learning (ML) has improved one-dimensional (1D) NMR shift prediction, predicting 2D NMR remains challenging due to limited annotated data. To address this, we introduce an unsupervised training framework for predicting cross-peaks in 2D NMR, specifically Heteronuclear Single Quantum Coherence (HSQC). Our approach pretrains an ML model on an annotated 1D dataset of <sup>1</sup>H and <sup>13</sup>C shifts, then finetunes it in an unsupervised manner using unlabeled HSQC data, which simultaneously generates cross-peak annotations. Our model also adjusts for solvent effects. Evaluation on 479 expert-annotated HSQC spectra demonstrates our model's superiority over traditional methods (ChemDraw and Mestrenova), achieving Mean Absolute Errors (MAEs) of 2.05 ppm and 0.165 ppm for <sup>13</sup>C shifts and <sup>1</sup>H shifts respectively. Our algorithmic annotations show a 95.21% concordance with experts' assignments, underscoring the approach's potential for structural elucidation in fields like organic chemistry, pharmaceuticals, and natural products.</p>","PeriodicalId":10529,"journal":{"name":"Communications Chemistry","volume":"8 1","pages":"51"},"PeriodicalIF":5.9000,"publicationDate":"2025-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11842623/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Communications Chemistry","FirstCategoryId":"92","ListUrlMain":"https://doi.org/10.1038/s42004-025-01455-9","RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"CHEMISTRY, MULTIDISCIPLINARY","Score":null,"Total":0}
引用次数: 0

Abstract

Nuclear Magnetic Resonance (NMR) spectroscopy is essential for revealing molecular structure, electronic environment, and dynamics. Accurate NMR shift prediction allows researchers to validate structures by comparing predicted and observed shifts. While Machine Learning (ML) has improved one-dimensional (1D) NMR shift prediction, predicting 2D NMR remains challenging due to limited annotated data. To address this, we introduce an unsupervised training framework for predicting cross-peaks in 2D NMR, specifically Heteronuclear Single Quantum Coherence (HSQC). Our approach pretrains an ML model on an annotated 1D dataset of 1H and 13C shifts, then finetunes it in an unsupervised manner using unlabeled HSQC data, which simultaneously generates cross-peak annotations. Our model also adjusts for solvent effects. Evaluation on 479 expert-annotated HSQC spectra demonstrates our model's superiority over traditional methods (ChemDraw and Mestrenova), achieving Mean Absolute Errors (MAEs) of 2.05 ppm and 0.165 ppm for 13C shifts and 1H shifts respectively. Our algorithmic annotations show a 95.21% concordance with experts' assignments, underscoring the approach's potential for structural elucidation in fields like organic chemistry, pharmaceuticals, and natural products.

求助全文
约1分钟内获得全文 求助全文
来源期刊
Communications Chemistry
Communications Chemistry Chemistry-General Chemistry
CiteScore
7.70
自引率
1.70%
发文量
146
审稿时长
13 weeks
期刊介绍: Communications Chemistry is an open access journal from Nature Research publishing high-quality research, reviews and commentary in all areas of the chemical sciences. Research papers published by the journal represent significant advances bringing new chemical insight to a specialized area of research. We also aim to provide a community forum for issues of importance to all chemists, regardless of sub-discipline.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信