Jaesun Kim, Jisu Kim, Jaehoon Kim, Jiho Lee, Yutack Park, Youngho Kang, Seungwu Han
{"title":"为高保真机器学习原子间势提供数据高效的多保真训练","authors":"Jaesun Kim, Jisu Kim, Jaehoon Kim, Jiho Lee, Yutack Park, Youngho Kang, Seungwu Han","doi":"arxiv-2409.07947","DOIUrl":null,"url":null,"abstract":"Machine learning interatomic potentials (MLIPs) are used to estimate\npotential energy surfaces (PES) from ab initio calculations, providing near\nquantum-level accuracy with reduced computational costs. However, the high cost\nof assembling high-fidelity databases hampers the application of MLIPs to\nsystems that require high chemical accuracy. Utilizing an equivariant graph\nneural network, we present an MLIP framework that trains on multi-fidelity\ndatabases simultaneously. This approach enables the accurate learning of\nhigh-fidelity PES with minimal high-fidelity data. We test this framework on\nthe Li$_6$PS$_5$Cl and In$_x$Ga$_{1-x}$N systems. The computational results\nindicate that geometric and compositional spaces not covered by the\nhigh-fidelity meta-gradient generalized approximation (meta-GGA) database can\nbe effectively inferred from low-fidelity GGA data, thus enhancing accuracy and\nmolecular dynamics stability. We also develop a general-purpose MLIP that\nutilizes both GGA and meta-GGA data from the Materials Project, significantly\nenhancing MLIP performance for high-accuracy tasks such as predicting energies\nabove hull for crystals in general. Furthermore, we demonstrate that the\npresent multi-fidelity learning is more effective than transfer learning or\n$\\Delta$-learning an d that it can also be applied to learn higher-fidelity up\nto the coupled-cluster level. We believe this methodology holds promise for\ncreating highly accurate bespoke or universal MLIPs by effectively expanding\nthe high-fidelity dataset.","PeriodicalId":501234,"journal":{"name":"arXiv - PHYS - Materials Science","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Data-efficient multi-fidelity training for high-fidelity machine learning interatomic potentials\",\"authors\":\"Jaesun Kim, Jisu Kim, Jaehoon Kim, Jiho Lee, Yutack Park, Youngho Kang, Seungwu Han\",\"doi\":\"arxiv-2409.07947\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Machine learning interatomic potentials (MLIPs) are used to estimate\\npotential energy surfaces (PES) from ab initio calculations, providing near\\nquantum-level accuracy with reduced computational costs. However, the high cost\\nof assembling high-fidelity databases hampers the application of MLIPs to\\nsystems that require high chemical accuracy. Utilizing an equivariant graph\\nneural network, we present an MLIP framework that trains on multi-fidelity\\ndatabases simultaneously. This approach enables the accurate learning of\\nhigh-fidelity PES with minimal high-fidelity data. We test this framework on\\nthe Li$_6$PS$_5$Cl and In$_x$Ga$_{1-x}$N systems. The computational results\\nindicate that geometric and compositional spaces not covered by the\\nhigh-fidelity meta-gradient generalized approximation (meta-GGA) database can\\nbe effectively inferred from low-fidelity GGA data, thus enhancing accuracy and\\nmolecular dynamics stability. We also develop a general-purpose MLIP that\\nutilizes both GGA and meta-GGA data from the Materials Project, significantly\\nenhancing MLIP performance for high-accuracy tasks such as predicting energies\\nabove hull for crystals in general. 
Furthermore, we demonstrate that the\\npresent multi-fidelity learning is more effective than transfer learning or\\n$\\\\Delta$-learning an d that it can also be applied to learn higher-fidelity up\\nto the coupled-cluster level. We believe this methodology holds promise for\\ncreating highly accurate bespoke or universal MLIPs by effectively expanding\\nthe high-fidelity dataset.\",\"PeriodicalId\":501234,\"journal\":{\"name\":\"arXiv - PHYS - Materials Science\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-09-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - PHYS - Materials Science\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2409.07947\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - PHYS - Materials Science","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.07947","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Data-efficient multi-fidelity training for high-fidelity machine learning interatomic potentials
Machine learning interatomic potentials (MLIPs) are used to estimate
potential energy surfaces (PES) from ab initio calculations, providing near
quantum-level accuracy with reduced computational costs. However, the high cost
of assembling high-fidelity databases hampers the application of MLIPs to
systems that require high chemical accuracy. Utilizing an equivariant graph
neural network, we present an MLIP framework that trains on multi-fidelity
databases simultaneously. This approach enables the accurate learning of
high-fidelity PES with minimal high-fidelity data. We test this framework on
the Li$_6$PS$_5$Cl and In$_x$Ga$_{1-x}$N systems. The computational results
indicate that geometric and compositional spaces not covered by the
high-fidelity meta-generalized gradient approximation (meta-GGA) database can
be effectively inferred from low-fidelity GGA data, thus enhancing accuracy and
molecular dynamics stability. We also develop a general-purpose MLIP that
utilizes both GGA and meta-GGA data from the Materials Project, significantly
enhancing MLIP performance for high-accuracy tasks such as predicting energies
above the convex hull for crystals in general. Furthermore, we demonstrate that the
present multi-fidelity learning is more effective than transfer learning or
$\Delta$-learning, and that it can also be applied to learn higher-fidelity PES up
to the coupled-cluster level. We believe this methodology holds promise for
creating highly accurate bespoke or universal MLIPs by effectively expanding
the high-fidelity dataset.
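
The abstract does not disclose implementation details, so the following is a minimal, hypothetical sketch of the simultaneous multi-fidelity training idea it describes: a single shared model conditioned on a learnable per-fidelity embedding, trained jointly on many low-fidelity (GGA-like) labels and a few high-fidelity (meta-GGA-like) labels. The toy descriptors, architecture, and data below are placeholders, not the authors' equivariant graph neural network.

    # Hypothetical sketch of simultaneous multi-fidelity training; features,
    # architecture, and data are illustrative placeholders only.
    import torch
    from torch import nn

    class MultiFidelityEnergyModel(nn.Module):
        def __init__(self, n_features: int, n_fidelities: int = 2, hidden: int = 64):
            super().__init__()
            # One learnable vector per fidelity level (0 = "GGA", 1 = "meta-GGA" here).
            self.fidelity_embedding = nn.Embedding(n_fidelities, n_features)
            self.net = nn.Sequential(
                nn.Linear(n_features, hidden),
                nn.SiLU(),
                nn.Linear(hidden, 1),  # predicted energy per structure
            )

        def forward(self, x: torch.Tensor, fidelity: torch.Tensor) -> torch.Tensor:
            # Condition shared descriptors on the requested fidelity level.
            return self.net(x + self.fidelity_embedding(fidelity)).squeeze(-1)

    # Toy mixed-fidelity data: many cheap low-fidelity labels, few expensive
    # high-fidelity ones, trained together through a single combined loss.
    torch.manual_seed(0)
    n_features = 16
    x_low, e_low = torch.randn(200, n_features), torch.randn(200)   # "GGA" labels
    x_high, e_high = torch.randn(20, n_features), torch.randn(20)   # "meta-GGA" labels

    model = MultiFidelityEnergyModel(n_features)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()

    for step in range(100):
        optimizer.zero_grad()
        pred_low = model(x_low, torch.zeros(len(x_low), dtype=torch.long))
        pred_high = model(x_high, torch.ones(len(x_high), dtype=torch.long))
        # Both fidelities update the shared weights in the same step.
        loss = loss_fn(pred_low, e_low) + loss_fn(pred_high, e_high)
        loss.backward()
        optimizer.step()

    # At inference, high-fidelity predictions are requested by passing fidelity
    # index 1, even for structures represented only in the low-fidelity set.

In this sketch, the shared weights see the full low-fidelity coverage while the per-fidelity embedding absorbs the systematic offset between the two potential-energy surfaces, which is one plausible way to realize the abstract's claim that high-fidelity predictions can be extended to regions covered only by low-fidelity data.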