Considerations in the use of machine learning force fields for free energy calculations.

IF 3.1 | Zone 2, Chemistry | JCR Q3, CHEMISTRY, PHYSICAL
Orlando A Mendible-Barreto, Jonathan K Whitmer, Yamil J Colón
Journal of Chemical Physics, 162 (17). Published 2025-05-07. Journal Article.
DOI: 10.1063/5.0252043 (https://doi.org/10.1063/5.0252043)
Citations: 0

Abstract

Considerations in the use of machine learning force fields for free energy calculations.

Machine learning force fields (MLFFs) promise to accurately describe the potential energy surface of molecules at the ab initio level of theory with improved computational efficiency. Among MLFFs, equivariant graph neural networks (EQNNs) have shown great promise in accuracy and performance and are the focus of this work. The capability of EQNNs to recover free energy surfaces (FESs) remains to be thoroughly investigated. In this work, we investigate how the distribution of collective variables (CVs) within the training data affects the accuracy of EQNNs in predicting the FESs of butane and alanine dipeptide. A generalizable workflow is presented in which training configurations are generated with classical molecular dynamics simulations, and energies and forces are obtained with ab initio calculations. We evaluate how bond and angle constraints in the training data influence the accuracy of EQNN force fields in reproducing the FESs of the molecules at both classical and ab initio levels of theory. Results indicate that the model's accuracy is unaffected by the distribution of sampled CVs during training, provided that the training data includes configurations from characteristic regions of the system's FES. However, when the training data is obtained from classical simulations, the EQNN struggles to extrapolate the free energy of high-free-energy configurations. In contrast, models trained with the same configurations on ab initio data show improved extrapolation accuracy. The findings underscore the difficulties in creating a comprehensive training dataset for EQNNs to predict FESs and highlight the importance of prior knowledge of the system's FES.
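The abstract does not say how the free energy surfaces were estimated. As a rough illustration of what "recovering an FES along a CV" means, the sketch below applies the textbook relation F(s) = -k_B T ln P(s) to a histogram of CV samples. It assumes the samples come from an unbiased simulation (biased or enhanced-sampling runs would need reweighting first). The function name, its parameters, the default range of [-π, π] for a dihedral-like CV, and the kJ/mol units are all hypothetical choices made for this example, not taken from the paper.

```python
import math

KB_KJ_PER_MOL_K = 0.0083144626  # Boltzmann constant in kJ/(mol K)


def free_energy_profile(cv_samples, temperature, n_bins=36,
                        cv_min=-math.pi, cv_max=math.pi):
    """Estimate F(s) = -kB*T*ln P(s) from unbiased samples of one CV.

    Returns (bin_centers, free_energies) in kJ/mol with the minimum
    shifted to zero. Bins that were never visited get math.inf.
    """
    width = (cv_max - cv_min) / n_bins
    counts = [0] * n_bins
    for s in cv_samples:
        if cv_min <= s <= cv_max:
            # Clamp so that s == cv_max lands in the last bin.
            idx = min(int((s - cv_min) / width), n_bins - 1)
            counts[idx] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no samples fall inside [cv_min, cv_max]")
    kt = KB_KJ_PER_MOL_K * temperature
    # Probability density per bin is count / (total * width).
    fes = [-kt * math.log(c / (total * width)) if c > 0 else math.inf
           for c in counts]
    f_min = min(fes)  # finite, because at least one bin is populated
    centers = [cv_min + (i + 0.5) * width for i in range(n_bins)]
    return centers, [f - f_min for f in fes]


# Toy usage: 300 samples near s = 0.05 rad and 100 near s = 1.0 rad.
samples = [0.05] * 300 + [1.0] * 100
centers, fes = free_energy_profile(samples, temperature=300.0)
```

In this toy input the less populated bin sits k_B T ln 3 above the more populated one, and every unvisited bin is reported as infinite. That second property is one simple way to see why regions of CV space that are rarely or never sampled are hard to characterize from data alone.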

Source journal
Journal of Chemical Physics (Physics: Atomic, Molecular and Chemical Physics)
CiteScore: 7.40
Self-citation rate: 15.90%
Articles published: 1615
Review time: 2 months
About the journal: The Journal of Chemical Physics publishes quantitative and rigorous science of long-lasting value in methods and applications of chemical physics. The Journal also publishes brief Communications of significant new findings, Perspectives on the latest advances in the field, and Special Topic issues. The Journal focuses on innovative research in experimental and theoretical areas of chemical physics, including spectroscopy, dynamics, kinetics, statistical mechanics, and quantum mechanics. In addition, topical areas such as polymers, soft matter, materials, surfaces/interfaces, and systems of biological relevance are of increasing importance. Topical coverage includes: Theoretical Methods and Algorithms; Advanced Experimental Techniques; Atoms, Molecules, and Clusters; Liquids, Glasses, and Crystals; Surfaces, Interfaces, and Materials; Polymers and Soft Matter; Biological Molecules and Networks.