Jaesun Kim, Jisu Kim, Jaehoon Kim, Jiho Lee, Yutack Park, Youngho Kang, Seungwu Han
Journal of the American Chemical Society, published 2024-12-17. DOI: 10.1021/jacs.4c14455 (https://doi.org/10.1021/jacs.4c14455)
Data-Efficient Multifidelity Training for High-Fidelity Machine Learning Interatomic Potentials
Machine learning interatomic potentials (MLIPs) are used to estimate potential energy surfaces (PES) from ab initio calculations, providing near-quantum-level accuracy with reduced computational costs. However, the high cost of assembling high-fidelity databases hampers the application of MLIPs to systems that require high chemical accuracy. Utilizing an equivariant graph neural network, we present an MLIP framework that trains on multifidelity databases simultaneously. This approach enables the accurate learning of high-fidelity PES with minimal high-fidelity data. Employing the generalized gradient approximation (GGA) and meta-GGA as low- and high-fidelity approaches, respectively, we tested this framework on the Li₆PS₅Cl and InₓGa₁₋ₓN systems. The results show that, with a high-fidelity training set approximately 10% the size of the low-fidelity set, the multifidelity training framework achieves excellent accuracy, with Li-ion conductivity predictions within 10% error and InₓGa₁₋ₓN mixing energy showing an R² of 0.98 compared to the reference high-fidelity MLIP results. This indicates that geometric and compositional spaces not covered by the high-fidelity meta-GGA database can be effectively inferred from low-fidelity GGA data, thus enhancing accuracy and molecular dynamics stability. We also developed a general-purpose MLIP that utilizes both GGA and meta-GGA data from the Materials Project, significantly enhancing MLIP performance for high-accuracy tasks such as predicting energies above hull for crystals in general. Furthermore, we demonstrate that the present multifidelity learning is more effective than transfer learning or Δ-learning and that it can also be applied to learn higher fidelities, up to the coupled-cluster level. We believe this methodology holds promise for creating highly accurate bespoke or universal MLIPs by effectively expanding the high-fidelity data set.
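The idea of training on two fidelities at once can be illustrated with a deliberately small toy problem. The sketch below is not the authors' equivariant graph neural network. It is a one-dimensional linear least-squares analogue in which both fidelities share one set of weights and only the high-fidelity samples see an extra correction term. All function names, the basis functions, and the two synthetic target functions are assumptions made for illustration. The high-fidelity set is 10% the size of the low-fidelity set and covers only part of the input range, loosely mirroring the setting described in the abstract.

```python
import numpy as np

def shared_features(x):
    """Basis shared by both fidelities (hypothetical choice for this toy)."""
    return np.column_stack([np.sin(x), np.cos(x), np.sin(2 * x), np.cos(2 * x)])

def correction_features(x):
    """Extra terms that only high-fidelity samples are allowed to use."""
    return np.column_stack([np.ones_like(x), x])

def low_fidelity(x):
    """Synthetic stand-in for a cheap reference method."""
    return np.sin(x)

def high_fidelity(x):
    """Synthetic stand-in for an expensive reference method."""
    return np.sin(x) + 0.5 + 0.3 * x

def r2_score(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

# Many cheap samples over the full range, few expensive ones over a sub-range.
x_lo = np.linspace(0.0, 2.0 * np.pi, 50)
x_hi = np.linspace(0.0, 2.0, 5)

# Joint fit: low-fidelity rows have zeros in the correction columns,
# so the shared weights are constrained by both data sets at once.
A_lo = np.hstack([shared_features(x_lo), np.zeros((x_lo.size, 2))])
A_hi = np.hstack([shared_features(x_hi), correction_features(x_hi)])
A_joint = np.vstack([A_lo, A_hi])
y_joint = np.concatenate([low_fidelity(x_lo), high_fidelity(x_hi)])
w_multi, *_ = np.linalg.lstsq(A_joint, y_joint, rcond=None)

# Baseline: the same model fitted to the high-fidelity samples alone
# (underdetermined here, so lstsq returns the minimum-norm solution).
w_hi_only, *_ = np.linalg.lstsq(A_hi, high_fidelity(x_hi), rcond=None)

# Evaluate both fits against the high-fidelity target on the full range.
x_test = np.linspace(0.0, 2.0 * np.pi, 200)
A_test = np.hstack([shared_features(x_test), correction_features(x_test)])
y_test = high_fidelity(x_test)
r2_multi = r2_score(y_test, A_test @ w_multi)
r2_hi_only = r2_score(y_test, A_test @ w_hi_only)

print(f"R^2, joint multifidelity fit: {r2_multi:.4f}")
print(f"R^2, high-fidelity data only: {r2_hi_only:.4f}")
```

In this toy, the low-fidelity samples pin down the shared weights over the whole input range, so the few high-fidelity samples only need to determine the correction term. That is the intuition behind inferring regions not covered by the high-fidelity data. The real framework replaces the linear basis with a neural network, and its details are not given in the abstract.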
About the journal:
The Journal of the American Chemical Society (JACS), the flagship journal of the American Chemical Society, has been published since 1879 and holds a preeminent position in chemistry and related interdisciplinary sciences. JACS publishes cutting-edge research across a wide range of topics, amounting to approximately 19,000 pages of Articles, Communications, and Perspectives annually, and appears weekly.