Data-efficient multi-fidelity training for high-fidelity machine learning interatomic potentials

Jaesun Kim, Jisu Kim, Jaehoon Kim, Jiho Lee, Yutack Park, Youngho Kang, Seungwu Han
{"title":"Data-efficient multi-fidelity training for high-fidelity machine learning interatomic potentials","authors":"Jaesun Kim, Jisu Kim, Jaehoon Kim, Jiho Lee, Yutack Park, Youngho Kang, Seungwu Han","doi":"arxiv-2409.07947","DOIUrl":null,"url":null,"abstract":"Machine learning interatomic potentials (MLIPs) are used to estimate\npotential energy surfaces (PES) from ab initio calculations, providing near\nquantum-level accuracy with reduced computational costs. However, the high cost\nof assembling high-fidelity databases hampers the application of MLIPs to\nsystems that require high chemical accuracy. Utilizing an equivariant graph\nneural network, we present an MLIP framework that trains on multi-fidelity\ndatabases simultaneously. This approach enables the accurate learning of\nhigh-fidelity PES with minimal high-fidelity data. We test this framework on\nthe Li$_6$PS$_5$Cl and In$_x$Ga$_{1-x}$N systems. The computational results\nindicate that geometric and compositional spaces not covered by the\nhigh-fidelity meta-gradient generalized approximation (meta-GGA) database can\nbe effectively inferred from low-fidelity GGA data, thus enhancing accuracy and\nmolecular dynamics stability. We also develop a general-purpose MLIP that\nutilizes both GGA and meta-GGA data from the Materials Project, significantly\nenhancing MLIP performance for high-accuracy tasks such as predicting energies\nabove hull for crystals in general. Furthermore, we demonstrate that the\npresent multi-fidelity learning is more effective than transfer learning or\n$\\Delta$-learning an d that it can also be applied to learn higher-fidelity up\nto the coupled-cluster level. We believe this methodology holds promise for\ncreating highly accurate bespoke or universal MLIPs by effectively expanding\nthe high-fidelity dataset.","PeriodicalId":501234,"journal":{"name":"arXiv - PHYS - Materials Science","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - PHYS - Materials Science","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.07947","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Machine learning interatomic potentials (MLIPs) are used to estimate potential energy surfaces (PES) from ab initio calculations, providing near quantum-level accuracy with reduced computational costs. However, the high cost of assembling high-fidelity databases hampers the application of MLIPs to systems that require high chemical accuracy. Utilizing an equivariant graph neural network, we present an MLIP framework that trains on multi-fidelity databases simultaneously. This approach enables the accurate learning of high-fidelity PES with minimal high-fidelity data. We test this framework on the Li$_6$PS$_5$Cl and In$_x$Ga$_{1-x}$N systems. The computational results indicate that geometric and compositional spaces not covered by the high-fidelity meta-generalized gradient approximation (meta-GGA) database can be effectively inferred from low-fidelity GGA data, thus enhancing accuracy and molecular dynamics stability. We also develop a general-purpose MLIP that utilizes both GGA and meta-GGA data from the Materials Project, significantly enhancing MLIP performance for high-accuracy tasks such as predicting energies above the hull for crystals in general. Furthermore, we demonstrate that the present multi-fidelity learning is more effective than transfer learning or $\Delta$-learning, and that it can also be applied to learn higher fidelities up to the coupled-cluster level. We believe this methodology holds promise for creating highly accurate bespoke or universal MLIPs by effectively expanding the high-fidelity dataset.
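To make the joint-training idea concrete, below is a minimal, hypothetical sketch of multi-fidelity training with a shared model and a per-fidelity readout head, where both GGA-level and meta-GGA-level samples are fit in the same loss. All names, the MLP stand-in for the equivariant GNN, and the toy data are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch only: a shared backbone with one energy head per fidelity level.
# In a real MLIP the backbone would be an equivariant GNN on atomic graphs;
# a plain MLP on fixed-size descriptors stands in here (hypothetical setup).
import torch
import torch.nn as nn

class MultiFidelityEnergyModel(nn.Module):
    def __init__(self, descriptor_dim: int, hidden_dim: int, n_fidelities: int):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(descriptor_dim, hidden_dim), nn.SiLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.SiLU(),
        )
        # One linear readout per fidelity (e.g., 0 = GGA, 1 = meta-GGA).
        self.heads = nn.ModuleList(nn.Linear(hidden_dim, 1) for _ in range(n_fidelities))

    def forward(self, descriptors: torch.Tensor, fidelity: torch.Tensor) -> torch.Tensor:
        feats = self.backbone(descriptors)
        # Evaluate every head, then select the one matching each sample's fidelity tag.
        all_heads = torch.stack([h(feats).squeeze(-1) for h in self.heads], dim=-1)
        return all_heads.gather(-1, fidelity.unsqueeze(-1)).squeeze(-1)

# Toy joint-training loop: each batch mixes low- and high-fidelity samples,
# so the shared backbone is fit to both databases simultaneously.
model = MultiFidelityEnergyModel(descriptor_dim=32, hidden_dim=64, n_fidelities=2)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(128, 32)           # stand-in structure descriptors
fid = torch.randint(0, 2, (128,))  # 0 = low fidelity, 1 = high fidelity
y = torch.randn(128)               # reference energies (toy values)
for _ in range(10):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x, fid), y)
    loss.backward()
    opt.step()
```

Under this kind of setup, the high-fidelity head only needs enough data to calibrate its readout, while the shared backbone absorbs geometric and compositional coverage from the abundant low-fidelity data, which is the intuition behind the data efficiency claimed in the abstract.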