Orlando A Mendible-Barreto, Jonathan K Whitmer, Yamil J Colón
Journal of Chemical Physics, 162 (17). Published 2025-05-07. DOI: 10.1063/5.0252043 (https://doi.org/10.1063/5.0252043)
Considerations in the use of machine learning force fields for free energy calculations.
Machine learning force fields (MLFFs) promise to accurately describe the potential energy surface of molecules at the ab initio level of theory with improved computational efficiency. Within MLFFs, equivariant graph neural networks (EQNNs) have shown great promise in accuracy and performance and are the focus of this work. The capability of EQNNs to recover free energy surfaces (FES) remains to be thoroughly investigated. In this work, we investigate the impact of the distribution of collective variables (CVs) within the training data on the accuracy of EQNNs in predicting the FES of butane and alanine dipeptide. A generalizable workflow is presented in which training configurations are generated with classical molecular dynamics simulations, and energies and forces are obtained with ab initio calculations. We evaluate how bond and angle constraints in the training data influence the accuracy of EQNN force fields in reproducing the FES of the molecules at both classical and ab initio levels of theory. Results indicate that the model's accuracy is unaffected by the distribution of sampled CVs during training, provided that the training data includes configurations from characteristic regions of the system's FES. However, when the training data is obtained from classical simulations, the EQNN struggles to extrapolate the free energy for configurations with high free energy. In contrast, models trained with the same configurations on ab initio data show improved extrapolation accuracy. The findings underscore the difficulties in creating a comprehensive training dataset for EQNNs to predict FESs and highlight the importance of prior knowledge of the system's FES.
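To make the relationship between a collective variable and a free energy surface concrete, the following is a minimal sketch, not the authors' workflow. It assumes an unbiased trajectory, a single dihedral angle as the CV (for example the C-C-C-C torsion of butane), and the standard histogram estimate F(s) = -k_B T ln P(s). The function names, the bin count, and the 300 K temperature are illustrative assumptions; the abstract does not state which sampling or estimation method was used.

```python
import numpy as np

KB_KJ_PER_MOL_K = 0.0083144626  # Boltzmann constant in kJ/(mol*K)


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (radians) defined by four atomic positions."""
    b0 = p0 - p1
    b1 = p2 - p1
    b2 = p3 - p2
    b1 = b1 / np.linalg.norm(b1)
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    v = b0 - np.dot(b0, b1) * b1
    w = b2 - np.dot(b2, b1) * b1
    x = np.dot(v, w)
    y = np.dot(np.cross(b1, v), w)
    return np.arctan2(y, x)


def free_energy_profile(cv_samples, temperature=300.0, bins=36,
                        cv_range=(-np.pi, np.pi)):
    """Histogram estimate of F(s) = -kT ln P(s), shifted so min(F) = 0.

    Assumes the samples come from an unbiased simulation. Bins that were
    never visited get F = +inf, i.e. no information.
    """
    hist, edges = np.histogram(cv_samples, bins=bins, range=cv_range,
                               density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    kT = KB_KJ_PER_MOL_K * temperature
    with np.errstate(divide="ignore"):
        fes = -kT * np.log(hist)
    fes -= fes[np.isfinite(fes)].min()
    return centers, fes


if __name__ == "__main__":
    # Synthetic CV samples (hypothetical, for illustration only).
    rng = np.random.default_rng(0)
    samples = rng.normal(loc=np.pi / 3, scale=0.3, size=5000)
    centers, fes = free_energy_profile(samples)
    unsampled = int(np.sum(~np.isfinite(fes)))
    print(f"{unsampled} of {len(fes)} bins were never sampled")
```

Bins with infinite free energy in this sketch are regions the trajectory never visited. The same kind of coverage check can be applied to the CV distribution of a training set to see whether the characteristic regions of the FES are represented.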
About the journal:
The Journal of Chemical Physics publishes quantitative and rigorous science of long-lasting value in methods and applications of chemical physics. The Journal also publishes brief Communications of significant new findings, Perspectives on the latest advances in the field, and Special Topic issues. The Journal focuses on innovative research in experimental and theoretical areas of chemical physics, including spectroscopy, dynamics, kinetics, statistical mechanics, and quantum mechanics. In addition, topical areas such as polymers, soft matter, materials, surfaces/interfaces, and systems of biological relevance are of increasing importance.
Topical coverage includes:
Theoretical Methods and Algorithms
Advanced Experimental Techniques
Atoms, Molecules, and Clusters
Liquids, Glasses, and Crystals
Surfaces, Interfaces, and Materials
Polymers and Soft Matter
Biological Molecules and Networks.