Gaussian processes for finite size extrapolation of many-body simulations†

Impact factor 3.4; Zone 3 (Chemistry); JCR Q2 (Chemistry)
Edgar Josué Landinez Borda, Kenneth O. Berard, Annette Lopez and Brenda Rubenstein
Faraday Discussions, vol. 254, pp. 500-528. Published 2024-03-25. DOI: 10.1039/D4FD00051J
Article page: https://pubs.rsc.org/en/content/articlelanding/2024/fd/d4fd00051j

Abstract

Key to being able to accurately model the properties of realistic materials is being able to predict their properties in the thermodynamic limit. Nevertheless, because most many-body electronic structure methods scale as a high-order polynomial, or even exponentially, with system size, directly simulating large systems in their thermodynamic limit rapidly becomes computationally intractable. As a result, researchers typically estimate the properties of large systems that approach the thermodynamic limit by extrapolating the properties of smaller, computationally-accessible systems based on relatively simple scaling expressions. In this work, we employ Gaussian processes to more accurately and efficiently extrapolate many-body simulations to their thermodynamic limit. We train our Gaussian processes on Smooth Overlap of Atomic Positions (SOAP) descriptors to extrapolate the energies of one-dimensional hydrogen chains obtained using two high-accuracy many-body methods: coupled cluster theory and Auxiliary Field Quantum Monte Carlo (AFQMC). In so doing, we show that Gaussian processes trained on relatively short 10–30-atom chains can predict the energies of both homogeneous and inhomogeneous hydrogen chains in their thermodynamic limit with sub-milliHartree accuracy. Unlike standard scaling expressions, our GPR-based approach is highly generalizable given representative training data and is not dependent on systems’ geometries or dimensionality. This work highlights the potential for machine learning to correct for the finite size effects that routinely complicate the interpretation of finite size many-body simulations.
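
The idea in the abstract, that a Gaussian process trained on local atomic descriptors of short chains can predict bulk energies, can be illustrated with a small toy model. The sketch below is not the authors' implementation and makes several loud assumptions: a single scalar per atom (an exponentially weighted neighbor count) stands in for a real SOAP descriptor, the "reference" energies come from a made-up local energy function rather than coupled cluster or AFQMC, and the names and values `SPACING`, `CUTOFF`, `LENGTH_SCALE`, and `JITTER` are hypothetical. The structure kernel is a sum of RBF kernels over all pairs of atomic environments, so the model learns per-environment energies from total energies only.

```python
import numpy as np

SPACING = 1.0        # toy bond length, arbitrary units (assumption)
CUTOFF = 2.5         # neighbors beyond this distance are ignored (assumption)
LENGTH_SCALE = 0.3   # RBF kernel width in descriptor space (assumption)
JITTER = 1e-6        # small noise term that regularizes the kernel matrix


def descriptors(n_atoms):
    """One scalar per atom: exponentially weighted neighbor count (crude SOAP stand-in)."""
    pos = SPACING * np.arange(n_atoms)
    dist = np.abs(pos[:, None] - pos[None, :])
    mask = (dist > 0) & (dist < CUTOFF)
    return np.where(mask, np.exp(-dist), 0.0).sum(axis=1)


def toy_atomic_energy(d):
    """Made-up local energy model; the GP only ever sees summed total energies."""
    return -0.5 - 0.1 * d**2


def env_kernel(da, db):
    """RBF kernel between every pair of environments in two descriptor arrays."""
    diff = da[:, None] - db[None, :]
    return np.exp(-0.5 * (diff / LENGTH_SCALE) ** 2)


def structure_kernel(da, db):
    """Kernel between two structures: sum of environment kernels over all atom pairs."""
    return env_kernel(da, db).sum()


# Training set: total energies of short chains with 10, 12, ..., 30 atoms.
train_sizes = list(range(10, 31, 2))
train_desc = [descriptors(n) for n in train_sizes]
train_energy = np.array([toy_atomic_energy(d).sum() for d in train_desc])

# GP regression weights: solve (K + jitter * I) alpha = E.
K = np.array([[structure_kernel(a, b) for b in train_desc] for a in train_desc])
alpha = np.linalg.solve(K + JITTER * np.eye(len(train_sizes)), train_energy)

# In the thermodynamic limit every atom sits in a bulk environment, so we
# predict the energy of one bulk environment (center atom of a 9-atom chain).
bulk_desc = descriptors(9)[4:5]
k_star = np.array([structure_kernel(bulk_desc, b) for b in train_desc])
e_bulk_gp = float(k_star @ alpha)

e_bulk_true = float(toy_atomic_energy(bulk_desc)[0])
e_naive = train_energy[-1] / train_sizes[-1]  # per-atom energy of the 30-atom chain

print(f"GP bulk energy per atom:  {e_bulk_gp:.6f}")
print(f"toy exact bulk energy:    {e_bulk_true:.6f}")
print(f"naive E(30)/30 estimate:  {e_naive:.6f}")
```

In this toy setting the edge atoms of each chain have different descriptors from the interior atoms, so the per-atom energy of even the longest training chain still carries an end effect, while the GP separates edge and bulk contributions. Real systems have long-range correlations that a short-cutoff local descriptor like this one would not capture, so the sketch says nothing about the accuracy reported in the paper.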


Source journal
Faraday Discussions (Chemistry, Physical)
CiteScore: 4.90
Self-citation rate: 0.00%
Articles published: 259
Review time: 2.8 months
Journal description: Discussion summary and research papers from discussion meetings that focus on rapidly developing areas of physical chemistry and its interfaces.