GPR_calculator: An on-the-fly surrogate model to accelerate massive nudged elastic band calculations
Authors: Isaac Onyango, Byungkyun Kang, Qiang Zhu
Journal: Computer Physics Communications, Volume 316, Article 109781
Published: 2025-07-28
DOI: 10.1016/j.cpc.2025.109781
URL: https://www.sciencedirect.com/science/article/pii/S0010465525002838
Abstract
We present GPR_calculator, a Python and C++ package that builds an on-the-fly surrogate model using Gaussian Process Regression (GPR) to approximate computationally expensive electronic structure calculations. The key idea is to train a GPR model dynamically during the simulation so that it predicts energies and forces together with an uncertainty estimate. When the predicted uncertainty is high, the costly electronic structure calculation is performed to obtain ground-truth data, which are then used to update the GPR model. To illustrate the effectiveness of GPR_calculator, we demonstrate its application to Nudged Elastic Band (NEB) simulations of surface diffusion and reactions, achieving a 3-10 times acceleration compared to pure ab initio calculations. The source code is available at https://github.com/MaterSim/GPR_calculator.
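For readers unfamiliar with how a GPR model produces an uncertainty alongside its prediction, the textbook posterior for a zero-mean Gaussian process is reproduced here. This is the standard form only; the kernel, descriptor, and force treatment actually used by GPR_calculator are not specified in this summary and are not assumed. Here K is the kernel matrix over the training configurations, k_* is the vector of kernel values between the new configuration x_* and the training set, y holds the reference values, and sigma_n^2 is a noise term.

```latex
\mu(\mathbf{x}_*) = \mathbf{k}_*^{\top}\left(K + \sigma_n^2 I\right)^{-1}\mathbf{y},
\qquad
\sigma^2(\mathbf{x}_*) = k(\mathbf{x}_*, \mathbf{x}_*) - \mathbf{k}_*^{\top}\left(K + \sigma_n^2 I\right)^{-1}\mathbf{k}_*
```

The predictive standard deviation \sigma(\mathbf{x}_*) shrinks near configurations that are already in the training set and grows away from them, which is what makes it usable as a trigger for falling back to the reference calculation.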
Program summary
Program Title: GPR_calculator
CPC Library link to program files: https://doi.org/10.17632/vyhpdf9fkh.1
Licensing provisions: MIT [1]
Programming language: Python 3 & C++
Nature of problem: Many atomistic simulations, such as geometry optimization, barrier calculations, molecular dynamics, and equation-of-state simulations, require sampling a large number of atomic configurations in a compact phase space. While Density Functional Theory (DFT) provides good accuracy and relatively scalable performance for systems of up to a few hundred atoms, it can become prohibitively expensive for massive simulations. This is particularly evident in energy barrier calculations for surface diffusion or reaction studies, where hundreds or thousands of energy and force evaluations are needed.
Solution method: GPR_calculator is an on-the-fly atomistic calculator based on Gaussian Process Regression (GPR), designed as an add-on module for the popular Atomic Simulation Environment (ASE). It is essentially a hybrid approach that consists of: (i) a base calculator that provides ground-truth reference energies and forces for a given input structure, and (ii) a surrogate model, trained on-the-fly, that serves as the less expensive approximation. When the uncertainty of the GPR prediction exceeds a user-defined threshold, the base calculator is invoked to obtain accurate results and update the GPR model. This adaptive approach maintains accuracy while significantly reducing computational cost.
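The uncertainty-gated logic described above can be illustrated with a minimal, self-contained sketch. It is not the package's implementation or API: `ToyGPR`, `HybridCalculator`, and `toy_base_energy` are hypothetical names, the "structure" is a single scalar coordinate, the base calculator is an analytic double-well function standing in for DFT, and only energies (no forces) are modeled.

```python
import math

def rbf(a, b, length=0.5):
    """Squared-exponential kernel between two scalar coordinates."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

class ToyGPR:
    """Hypothetical 1D GPR surrogate (energies only, zero-mean prior)."""
    def __init__(self, noise=1e-6):
        self.X, self.y, self.noise = [], [], noise

    def add(self, x, y):
        self.X.append(x)
        self.y.append(y)

    def predict(self, x):
        """Return (posterior mean, posterior standard deviation) at x."""
        if not self.X:
            return 0.0, 1.0  # prior mean and prior std before any data
        K = [[rbf(a, b) + (self.noise if i == j else 0.0)
              for j, b in enumerate(self.X)] for i, a in enumerate(self.X)]
        k = [rbf(x, a) for a in self.X]
        alpha = solve(K, self.y)   # (K + noise*I)^-1 y
        v = solve(K, k)            # (K + noise*I)^-1 k_*
        mean = sum(ki * ai for ki, ai in zip(k, alpha))
        var = max(rbf(x, x) - sum(ki * vi for ki, vi in zip(k, v)), 0.0)
        return mean, math.sqrt(var)

class HybridCalculator:
    """Hypothetical wrapper: use the surrogate unless it is too uncertain."""
    def __init__(self, base, surrogate, threshold):
        self.base, self.surrogate, self.threshold = base, surrogate, threshold
        self.base_calls = 0

    def get_energy(self, x):
        mean, std = self.surrogate.predict(x)
        if std > self.threshold:
            energy = self.base(x)            # expensive reference call
            self.base_calls += 1
            self.surrogate.add(x, energy)    # update the surrogate on-the-fly
            return energy
        return mean                          # cheap surrogate prediction

def toy_base_energy(x):
    """Stand-in for an expensive electronic structure calculation."""
    return (x * x - 1.0) ** 2

calc = HybridCalculator(toy_base_energy, ToyGPR(), threshold=0.05)
path = [-1.0 + 0.05 * i for i in range(41)]  # coordinates along a toy path
n_evals = 0
for _ in range(3):                           # repeated sweeps, as in an optimizer
    for x in path:
        calc.get_energy(x)
        n_evals += 1
print(f"base calculator calls: {calc.base_calls} of {n_evals} evaluations")
```

In this toy setting the threshold is compared against the posterior standard deviation of a unit-variance prior; in practice the threshold is user-defined and would be chosen in the units of the quantity being predicted.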
About the journal:
The focus of CPC is on contemporary computational methods and techniques and their implementation, the effectiveness of which will normally be evidenced by the author(s) within the context of a substantive problem in physics. Within this setting CPC publishes two types of paper.
Computer Programs in Physics (CPiP)
These papers describe significant computer programs to be archived in the CPC Program Library which is held in the Mendeley Data repository. The submitted software must be covered by an approved open source licence. Papers and associated computer programs that address a problem of contemporary interest in physics that cannot be solved by current software are particularly encouraged.
Computational Physics Papers (CP)
These are research papers on themes across computational physics and related disciplines, including but not limited to the following:
mathematical and numerical methods and algorithms;
computational models including those associated with the design, control and analysis of experiments; and
algebraic computation.
Each will normally include software implementation and performance details. The software implementation should, ideally, be available via GitHub, Zenodo or an institutional repository. In addition, research papers on the impact of advanced computer architecture and special purpose computers on computing in the physical sciences and software topics related to, and of importance in, the physical sciences may be considered.