Data-Efficient Construction of High-Fidelity Graph Deep Learning Interatomic Potentials

arXiv - PHYS - Computational Physics | Pub Date: 2024-09-02 | DOI: arxiv-2409.00957

Tsz Wai Ko, Shyue Ping Ong


Citations: 0

Abstract

Machine learning potentials (MLPs) have become an indispensable tool in large-scale atomistic simulations because they can reproduce ab initio potential energy surfaces (PESs) very accurately at a fraction of the computational cost. For computational efficiency, the training data for most MLPs today are computed using relatively cheap density functional theory (DFT) methods such as the Perdew-Burke-Ernzerhof (PBE) generalized gradient approximation (GGA) functional. Meta-GGAs such as the recently developed strongly constrained and appropriately normed (SCAN) functional have been shown to yield significantly improved descriptions of atomic interactions for diversely bonded systems, but their higher computational cost remains an impediment to their use in MLP development. In this work, we outline a data-efficient multi-fidelity approach to constructing Materials 3-body Graph Network (M3GNet) interatomic potentials that integrate different levels of theory within a single model. Using silicon and water as examples, we show that a multi-fidelity M3GNet model trained on a combined dataset of low-fidelity GGA calculations with 10% of high-fidelity SCAN calculations can achieve accuracies comparable to a single-fidelity M3GNet model trained on a dataset comprising 8x the number of SCAN calculations. This work paves the way for the development of high-fidelity MLPs in a cost-effective manner by leveraging existing low-fidelity datasets.
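The combined dataset described in the abstract can be pictured with a small sketch. This is not the authors' code. It assumes one possible reading of the abstract, in which every structure has a low-fidelity (GGA) label and a random 10% of structures also carry a high-fidelity (SCAN) label. The function name `build_multi_fidelity_dataset`, the integer fidelity codes, and the structure identifiers are all hypothetical and chosen only for illustration.

```python
import random

# Hypothetical integer encoding of the level of theory (an assumption,
# not taken from the paper).
FIDELITY_GGA = 0   # low fidelity, e.g. PBE
FIDELITY_SCAN = 1  # high fidelity


def build_multi_fidelity_dataset(structure_ids, high_fidelity_fraction=0.10, seed=0):
    """Return (structure_id, fidelity) pairs for a combined training set.

    Every structure contributes a low-fidelity entry. A random subset of
    the requested size additionally contributes a high-fidelity entry.
    """
    if not 0.0 <= high_fidelity_fraction <= 1.0:
        raise ValueError("high_fidelity_fraction must be in [0, 1]")

    ids = list(structure_ids)
    rng = random.Random(seed)  # fixed seed for a reproducible subset
    n_high = round(len(ids) * high_fidelity_fraction)
    high_subset = set(rng.sample(ids, n_high))

    dataset = [(sid, FIDELITY_GGA) for sid in ids]
    dataset += [(sid, FIDELITY_SCAN) for sid in ids if sid in high_subset]
    return dataset


if __name__ == "__main__":
    ids = [f"struct_{i:04d}" for i in range(1000)]
    data = build_multi_fidelity_dataset(ids)
    n_scan = sum(1 for _, fidelity in data if fidelity == FIDELITY_SCAN)
    print(f"{len(data)} training entries, {n_scan} at high fidelity")
```

Keeping the fidelity as an explicit label on each entry is what would let a single model see both levels of theory during training. How that label enters the M3GNet architecture is not specified in the abstract and is left out of this sketch.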
