Bridging the Gap Between High-Level Quantum Chemical Methods and Deep Learning Models

Viki Kumar Prasad, Alberto Otero-de-la-Roza, G. Dilabio
Machine Learning: Science and Technology. Published 2024-02-09. DOI: 10.1088/2632-2153/ad27e1 (https://doi.org/10.1088/2632-2153/ad27e1)

Abstract

Supervised deep learning (DL) models are becoming ubiquitous in computational chemistry because they can efficiently learn complex input-output relationships and predict chemical properties at a cost significantly lower than methods based on quantum mechanics. The central challenge in many DL applications is the need to invest considerable computational resources in generating large ($N > 10^5$) training sets so that the resulting DL model generalizes reliably to unseen systems. The lack of better alternatives has encouraged the use of low-cost and relatively inaccurate density-functional theory (DFT) methods to generate training data, leading to DL models that lack accuracy and reliability. In this article, we describe a robust and easily implemented approach based on property-specific atom-centered potentials (ACPs) that resolves this central challenge in DL model development. ACPs are one-electron potentials that are applied in combination with a cheap but inaccurate quantum mechanical method (e.g., double-$\zeta$ DFT) and fitted against relatively few high-level data ($N \approx 10^3$--$10^4$), possibly obtained from the literature. The resulting ACP-corrected methods retain the low cost of the double-$\zeta$ DFT approach while generating high-level-quality data in unseen systems for the specific property for which they were designed. ACPs can therefore serve as an intermediate method between high-level approaches and DL model development, enabling the calculation of large and accurate DL training sets for the chemical property of interest. We demonstrate the effectiveness of the proposed approach by predicting bond dissociation enthalpies, reaction barrier heights, and reaction energies with chemical accuracy at a computational cost lower than that of the DFT methods routinely used for DL training data set generation.
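The workflow described in the abstract (fit a correction on a small high-level reference set, then apply it cheaply to a much larger set) can be illustrated with a toy sketch. This is not the authors' ACP fitting procedure: real ACPs are one-electron potentials that act inside the electronic-structure calculation, whereas the sketch below assumes the correction is linear in a handful of per-system terms and uses purely synthetic numbers. The names `make_toy_data` and `TRUE_COEFFS`, the number of terms, and the use of ordinary least squares are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "true" systematic error of the cheap method, expressed as
# coefficients on five made-up correction terms (an assumption for this toy).
TRUE_COEFFS = np.array([1.5, -0.8, 0.3, 0.0, 2.0])


def make_toy_data(n):
    """Return synthetic (terms, low_level, high_level) arrays for n systems.

    These are random numbers standing in for real calculations; nothing here
    comes from an actual quantum chemical method.
    """
    terms = rng.normal(size=(n, TRUE_COEFFS.size))
    high_level = rng.normal(scale=10.0, size=n)
    noise = rng.normal(scale=0.1, size=n)
    low_level = high_level - terms @ TRUE_COEFFS + noise
    return terms, low_level, high_level


# Step 1: fit the correction on a small reference set (N ~ 1e3).
terms_fit, low_fit, high_fit = make_toy_data(1_000)
coeffs, *_ = np.linalg.lstsq(terms_fit, high_fit - low_fit, rcond=None)

# Step 2: apply the fitted correction to a large set (N ~ 1e5), which would
# then serve as DL training data. The high-level values are kept here only
# to measure the error; in practice they would be unavailable.
terms_big, low_big, high_big = make_toy_data(100_000)
corrected_big = low_big + terms_big @ coeffs

mae_low = np.mean(np.abs(low_big - high_big))
mae_corrected = np.mean(np.abs(corrected_big - high_big))
print(f"MAE, uncorrected low-level: {mae_low:.3f}")
print(f"MAE, corrected low-level:   {mae_corrected:.3f}")
```

In this toy setting the corrected values track the synthetic high-level values much more closely than the uncorrected ones, because the systematic error was constructed to be exactly linear in the chosen terms. Whether a real ACP-corrected method transfers as well to unseen systems is an empirical question that the article addresses with its own benchmarks.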