Zijian Meng, Hao Sun, Edmanuel Torres, Christopher Maxwell, Ryan Eric Grant, Laurent Karim Béland

Computational Materials Science, Volume 256, Article 113919 (published 2025-05-10). DOI: 10.1016/j.commatsci.2025.113919
Small-cell-based fast active learning of machine learning interatomic potentials
Machine learning interatomic potentials (MLIPs) are often trained with on-the-fly active learning, where sampled configurations from atomistic simulations are added to the training set. However, this approach is limited by the high computational cost of ab initio calculations for large systems. Recent works have shown that MLIPs trained on small cells (1–8 atoms) rival the accuracy of large-cell models (100s of atoms) at far lower computational cost. Herein, we refer to these as small-cell and large-cell training, respectively. In this work, we iterate on earlier small-cell training approaches and characterize our resultant small-cell protocol. Potassium and sodium-potassium systems were studied: the former, a simpler system benchmarked in detail; the latter, a more complex binary system for further validation. Our small-cell training approach achieves up to two orders of magnitude of cost savings compared to large-cell (54-atom) training, with some training runs requiring fewer than 120 core-hours. Static and thermodynamic properties predicted using the MLIPs were evaluated, with small-cell training in both systems yielding strong ab initio agreement. Small cells appear to encode the necessary information to model complex large-scale phenomena—solid-liquid interfaces, critical exponents, diverse concentrations—even when the training cells themselves are too small to accommodate these phenomena. Based on these tests, we provide analysis and recommendations.
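The on-the-fly active learning the abstract describes can be sketched as a simple loop: walk a trajectory of small-cell configurations, and whenever the current model is uncertain about a configuration, label that cell ab initio and retrain. The sketch below is a minimal toy illustration of that control flow, not the paper's implementation: `uncertainty`, `ab_initio_label`, and `retrain` are hypothetical stand-ins (a real protocol would use, e.g., an extrapolation-grade criterion and a DFT code), and configurations are reduced to single numbers.

```python
import random

def uncertainty(model, config):
    # Toy stand-in for an extrapolation-grade metric: distance of the
    # configuration's "feature" from the existing training set.
    if not model["train_feats"]:
        return float("inf")
    return min(abs(config - f) for f in model["train_feats"])

def ab_initio_label(config):
    # Stand-in for an ab initio call on a small (1-8 atom) cell;
    # here just a cheap analytic "energy".
    return config ** 2

def retrain(model, config, energy):
    # Stand-in for refitting the MLIP on the enlarged training set.
    model["train_feats"].append(config)
    model["labels"].append(energy)

def active_learning_loop(trajectory, threshold=0.5):
    # On-the-fly active learning: label and retrain only when the
    # model is uncertain. Returns the final training-set size.
    model = {"train_feats": [], "labels": []}
    for config in trajectory:
        if uncertainty(model, config) > threshold:
            retrain(model, config, ab_initio_label(config))
    return len(model["train_feats"])

random.seed(0)
traj = [random.uniform(0.0, 5.0) for _ in range(200)]
n_added = active_learning_loop(traj)
print(n_added)  # only a small fraction of the trajectory gets labeled
```

The cost advantage of small-cell training enters through `ab_initio_label`: because each labeled cell is tiny, every call is cheap, so even frequent retraining stays inexpensive relative to labeling 100s-of-atom cells.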
Journal description:
The goal of Computational Materials Science is to report on results that provide new or unique insights into, or significantly expand our understanding of, the properties of materials or phenomena associated with their design, synthesis, processing, characterization, and utilization. To be relevant to the journal, the results should be applied or applicable to specific material systems that are discussed within the submission.