Diverging errors: A comparison of DFT and machine-learning predictions of NMR shieldings

IF 1.8 | Zone 3 (Chemistry) | JCR Q4 (CHEMISTRY, PHYSICAL)

Solid State Nuclear Magnetic Resonance | Pub Date: 2025-06-26 | DOI: 10.1016/j.ssnmr.2025.102019

Ema Chaloupecká, Ondřej Socha, Martin Dračínský

Volume 138, Article 102019. Publisher page: https://www.sciencedirect.com/science/article/pii/S0926204025000359

Citations: 0

Abstract

Accurate prediction of NMR parameters from first principles is essential for the structural characterization of molecular solids. Recent studies have shown that single-molecule correction schemes based on hybrid DFT calculations can significantly improve the accuracy of periodic DFT predictions of nuclear shieldings. Here, we evaluate this correction approach not only for periodic DFT calculations but also for ShiftML2, a machine-learning model trained on PBE-calculated NMR data. For ¹³C nuclei, applying single-molecule PBE0 corrections to periodic PBE shieldings reduced the root-mean-square deviation (RMSD) from 2.18 to 1.20 ppm, with negligible improvement for ¹H. Applied to ShiftML2 predictions, the corrections yielded a smaller reduction in ¹³C RMSD (from 3.02 to 2.51 ppm) and again had minimal impact on ¹H predictions. Residual analysis revealed only a weak correlation between DFT and ML errors, suggesting that while some sources of systematic deviation may be shared, others are likely distinct. These results demonstrate that DFT-specific correction schemes do not straightforwardly translate to machine-learning models, highlighting the need for ML-tailored post-processing or retraining strategies. The findings have important implications for integrating machine learning into high-throughput NMR workflows and for developing more accurate predictive tools for solid-state spectroscopy.
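The abstract does not give the working formulas, so the following is only a minimal sketch of the three quantities it mentions, under stated assumptions. It assumes the single-molecule correction is additive (the base-level shielding plus the difference between a higher-level and a lower-level calculation on the isolated molecule), that RMSD is taken between predicted and reference values for the same nuclei, and that the residual analysis can be summarized with a Pearson correlation coefficient. All function names and the numbers in the usage lines are hypothetical illustrations, not data or code from the paper.

```python
import math


def single_molecule_correction(sigma_base, sigma_mol_low, sigma_mol_high):
    """Additive correction (assumed form): base + (high - low) per nucleus.

    sigma_base     -- shieldings from the base method (e.g. periodic PBE or an ML model)
    sigma_mol_low  -- isolated-molecule shieldings at the lower level (e.g. PBE)
    sigma_mol_high -- isolated-molecule shieldings at the higher level (e.g. PBE0)
    """
    return [b + (h - l) for b, l, h in zip(sigma_base, sigma_mol_low, sigma_mol_high)]


def rmsd(predicted, reference):
    """Root-mean-square deviation between two equally long sequences."""
    n = len(predicted)
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(predicted, reference)) / n)


def pearson_r(x, y):
    """Pearson correlation coefficient, e.g. between two sets of residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Made-up shieldings (ppm) for three nuclei, purely to show the call pattern.
base = [100.0, 50.0, 20.0]
mol_low = [101.0, 52.0, 21.0]
mol_high = [103.0, 51.0, 24.0]
corrected = single_molecule_correction(base, mol_low, mol_high)
```

With such helpers, the comparison described in the abstract would amount to computing `rmsd` before and after the correction for each base method, and `pearson_r` between the DFT residuals and the ML residuals. Any conversion from shieldings to chemical shifts is left out of this sketch.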



Source journal

Solid State Nuclear Magnetic Resonance (Physics: Spectroscopy)

CiteScore

5.30

Self-citation rate

9.40%

Articles published

Review time

72 days

About the journal: Solid State Nuclear Magnetic Resonance publishes original manuscripts of high scientific quality dealing with all experimental and theoretical aspects of solid-state NMR. This includes advances in instrumentation, development of new experimental techniques and methodology, new theoretical insights, new data processing and simulation methods, and original applications of established or novel methods to scientific problems.