Diverging errors: A comparison of DFT and machine-learning predictions of NMR shieldings

IF 1.8 · Zone 3, Chemistry · JCR Q4, CHEMISTRY, PHYSICAL
Ema Chaloupecká, Ondřej Socha, Martin Dračínský
Journal: Solid State Nuclear Magnetic Resonance, volume 138, Article 102019
Published: 2025-06-26 (Journal Article)
DOI: 10.1016/j.ssnmr.2025.102019
URL: https://www.sciencedirect.com/science/article/pii/S0926204025000359

Abstract

Accurate first-principles prediction of NMR parameters is essential for the structural characterization of molecular solids. Recent studies have shown that single-molecule correction schemes based on hybrid DFT calculations can significantly improve the accuracy of periodic DFT predictions of nuclear shieldings. Here, we evaluate this correction approach not only for periodic DFT calculations but also for ShiftML2, a machine-learning model trained on PBE-calculated NMR data. For ¹³C nuclei, applying single-molecule PBE0 corrections to periodic PBE shieldings reduced the root-mean-square deviation (RMSD) from 2.18 to 1.20 ppm, whereas the improvement for ¹H was negligible. Applied to ShiftML2 predictions, the corrections yielded a smaller reduction in ¹³C RMSD (from 3.02 to 2.51 ppm) and again had minimal impact on ¹H predictions. Residual analysis revealed a weak correlation between DFT and ML errors, suggesting that while some sources of systematic deviation may be shared, others are likely distinct. These results demonstrate that DFT-specific correction schemes do not translate straightforwardly to machine-learning models, highlighting the need for ML-tailored post-processing or retraining strategies. The findings have important implications for integrating machine learning into high-throughput NMR workflows and for developing more accurate predictive tools for solid-state spectroscopy.
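
The workflow summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes the commonly used additive form of the single-molecule correction, in which the difference between isolated-molecule PBE0 and PBE shieldings is added to the periodic (or ML-predicted) shielding, and it assumes that RMSDs are evaluated after a linear regression of shieldings against reference shifts; the abstract confirms neither detail. All function names are hypothetical, and the numbers in the usage section are made-up placeholders, not data from the paper.

```python
import math

def apply_molecular_correction(sigma_base, sigma_mol_pbe0, sigma_mol_pbe):
    """Assumed additive scheme: base shielding + (molecular PBE0 - molecular PBE)."""
    return [b + (hi - lo) for b, hi, lo in zip(sigma_base, sigma_mol_pbe0, sigma_mol_pbe)]

def linear_fit(x, y):
    """Least-squares slope and intercept for y ~ slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def residuals_after_fit(sigma, delta_ref):
    """Residuals (fitted shift minus reference shift) after the linear mapping."""
    slope, intercept = linear_fit(sigma, delta_ref)
    return [slope * s + intercept - d for s, d in zip(sigma, delta_ref)]

def rmsd(residuals):
    """Root-mean-square deviation of a list of residuals."""
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))

def pearson(a, b):
    """Pearson correlation coefficient between two residual lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Placeholder values for four hypothetical 13C sites (NOT from the paper).
delta_ref = [170.2, 128.5, 55.1, 21.3]        # reference shifts, ppm
sigma_periodic = [-2.0, 41.5, 118.0, 151.9]   # periodic PBE shieldings, ppm
sigma_ml = [-0.5, 40.0, 117.0, 153.0]         # ML-predicted shieldings, ppm
sigma_mol_pbe0 = [2.5, 44.0, 120.5, 154.0]    # isolated-molecule PBE0, ppm
sigma_mol_pbe = [-1.0, 42.0, 119.5, 153.5]    # isolated-molecule PBE, ppm

for label, base in (("periodic PBE", sigma_periodic), ("ML", sigma_ml)):
    corrected = apply_molecular_correction(base, sigma_mol_pbe0, sigma_mol_pbe)
    before = rmsd(residuals_after_fit(base, delta_ref))
    after = rmsd(residuals_after_fit(corrected, delta_ref))
    print(f"{label}: RMSD before = {before:.2f} ppm, after = {after:.2f} ppm")

# Correlation between uncorrected DFT and ML residuals, site by site.
r = pearson(residuals_after_fit(sigma_periodic, delta_ref),
            residuals_after_fit(sigma_ml, delta_ref))
print(f"DFT vs ML residual correlation: r = {r:.2f}")
```

Applying the same correction function to both the periodic and the ML shieldings mirrors the comparison described in the abstract; the residual correlation is the quantity that indicates whether the two methods share systematic errors.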


Source journal
CiteScore: 5.30
Self-citation rate: 9.40%
Articles published: 42
Review time: 72 days
About the journal: Solid State Nuclear Magnetic Resonance publishes original manuscripts of high scientific quality dealing with all experimental and theoretical aspects of solid-state NMR. This includes advances in instrumentation, development of new experimental techniques and methodology, new theoretical insights, new data processing and simulation methods, and original applications of established or novel methods to scientific problems.