Zijian Meng, Hao Sun, Edmanuel Torres, Christopher Maxwell, Ryan Eric Grant, Laurent Karim Béland

Computational Materials Science, Volume 256, Article 113919 (published 2025-05-10). DOI: 10.1016/j.commatsci.2025.113919
Small-cell-based fast active learning of machine learning interatomic potentials
Machine learning interatomic potentials (MLIPs) are often trained with on-the-fly active learning, where sampled configurations from atomistic simulations are added to the training set. However, this approach is limited by the high computational cost of ab initio calculations for large systems. Recent works have shown that MLIPs trained on small cells (1–8 atoms) rival the accuracy of large-cell models (100s of atoms) at far lower computational cost. Herein, we refer to these as small-cell and large-cell training, respectively. In this work, we iterate on earlier small-cell training approaches and characterize our resultant small-cell protocol. Potassium and sodium-potassium systems were studied: the former, a simpler system benchmarked in detail; the latter, a more complex binary system for further validation. Our small-cell training approach achieves up to two orders of magnitude of cost savings compared to large-cell (54-atom) training, with some training runs requiring fewer than 120 core-hours. Static and thermodynamic properties predicted using the MLIPs were evaluated, with small-cell training in both systems yielding strong ab initio agreement. Small cells appear to encode the necessary information to model complex large-scale phenomena—solid-liquid interfaces, critical exponents, diverse concentrations—even when the training cells themselves are too small to accommodate these phenomena. Based on these tests, we provide analysis and recommendations.
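The on-the-fly active learning the abstract describes can be sketched as a simple loop: walk a trajectory of small-cell configurations, and whenever the current model is uncertain about a configuration, label that cell ab initio and retrain. The sketch below is a minimal toy illustration of that control flow, not the paper's implementation: `uncertainty`, `ab_initio_label`, and `retrain` are hypothetical stand-ins (a real protocol would use, e.g., an extrapolation-grade criterion and a DFT code), and configurations are reduced to single numbers.

```python
import random

def uncertainty(model, config):
    # Toy stand-in for an extrapolation-grade metric: distance of the
    # configuration's "feature" from the existing training set.
    if not model["train_feats"]:
        return float("inf")
    return min(abs(config - f) for f in model["train_feats"])

def ab_initio_label(config):
    # Stand-in for an ab initio call on a small (1-8 atom) cell;
    # here just a cheap analytic "energy".
    return config ** 2

def retrain(model, config, energy):
    # Stand-in for refitting the MLIP on the enlarged training set.
    model["train_feats"].append(config)
    model["labels"].append(energy)

def active_learning_loop(trajectory, threshold=0.5):
    # On-the-fly active learning: label and retrain only when the
    # model is uncertain. Returns the final training-set size.
    model = {"train_feats": [], "labels": []}
    for config in trajectory:
        if uncertainty(model, config) > threshold:
            retrain(model, config, ab_initio_label(config))
    return len(model["train_feats"])

random.seed(0)
traj = [random.uniform(0.0, 5.0) for _ in range(200)]
n_added = active_learning_loop(traj)
print(n_added)  # only a small fraction of the trajectory gets labeled
```

The cost advantage of small-cell training enters through `ab_initio_label`: because each labeled cell is tiny, every call is cheap, so even frequent retraining stays inexpensive relative to labeling 100s-of-atom cells.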
Journal description:
The goal of Computational Materials Science is to report on results that provide new or unique insights into, or significantly expand our understanding of, the properties of materials or phenomena associated with their design, synthesis, processing, characterization, and utilization. To be relevant to the journal, the results should be applied or applicable to specific material systems that are discussed within the submission.