Gustavo R. Pérez-Lemus, Yinan Xu, Yezhi Jin, Pablo F. Zubieta Rico, Juan J. de Pablo
arXiv - PHYS - Computational Physics · arXiv:2408.16157 · 2024-08-28
The Importance of Learning without Constraints: Reevaluating Benchmarks for Invariant and Equivariant Features of Machine Learning Potentials in Generating Free Energy Landscapes
Machine-learned interatomic potentials (MLIPs) are rapidly gaining interest for molecular modeling, as they balance quantum-mechanical-level descriptions of atomic interactions with reasonable computational efficiency. However, questions remain about the stability of simulations that use these potentials, and about how far the learned potential energy function can be extrapolated safely. Past studies have reported challenges when MLIPs are applied to classical benchmark systems. In this work, we show that some of these challenges stem from characteristics of the training datasets, particularly the inclusion of rigid constraints. We demonstrate that long-term stability in simulations with MLIPs can be achieved by generating unconstrained datasets from unbiased classical simulations, provided the fast modes are correctly sampled. Additionally, we emphasize that precise energy predictions require enhanced sampling techniques for dataset generation, and we demonstrate that safe extrapolation of MLIPs depends on judicious choices related to the system's underlying free energy landscape and the symmetry features embedded within the machine learning models.
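The distinction between invariant and equivariant symmetry features mentioned above can be illustrated with a minimal sketch (our own illustration, not code from the paper): a descriptor built from sorted pairwise distances is unchanged under rigid rotations and translations of a configuration, whereas raw Cartesian coordinates transform along with the frame.

```python
import numpy as np

def pairwise_distances(coords):
    """Rotation- and translation-invariant descriptor: sorted pair distances."""
    diffs = coords[:, None, :] - coords[None, :, :]   # (N, N, 3) displacement table
    d = np.linalg.norm(diffs, axis=-1)                # (N, N) distance matrix
    iu = np.triu_indices(len(coords), k=1)            # unique pairs only
    return np.sort(d[iu])

def random_rotation(rng):
    """Random proper rotation matrix from the QR decomposition of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.standard_normal((3, 3)))
    q = q * np.sign(np.diag(r))       # fix column signs for a unique factorization
    if np.linalg.det(q) < 0:          # flip one column to force det = +1
        q[:, 0] *= -1
    return q

rng = np.random.default_rng(0)
coords = rng.standard_normal((5, 3))            # a toy 5-atom configuration
R = random_rotation(rng)
moved = coords @ R.T + np.array([1.0, -2.0, 0.5])  # rotate, then translate

# The invariant descriptor is identical; the raw coordinates are not.
print(np.allclose(pairwise_distances(coords), pairwise_distances(moved)))  # True
print(np.allclose(coords, moved))                                          # False
```

In an MLIP built on invariant features, the predicted energy inherits this exact symmetry by construction; equivariant architectures instead carry directional information that transforms consistently with the rotation.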
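The role of enhanced sampling in dataset generation can be sketched with a toy example (an assumption on our part; the abstract does not prescribe a specific method). In metadynamics, repulsive Gaussian hills are deposited along a collective variable as the simulation visits it, discouraging the walker from revisiting free-energy minima so that training configurations cover more of the landscape than unbiased dynamics would. Here the "system" is a single overdamped Langevin particle in a one-dimensional double well; all parameter values are illustrative.

```python
import numpy as np

def bias_force(x, centers, heights, width):
    """Force from deposited hills: -d/dx of sum_i h_i * exp(-(x - c_i)^2 / (2 w^2))."""
    if not centers:
        return 0.0
    c = np.asarray(centers)
    h = np.asarray(heights)
    g = h * np.exp(-0.5 * ((x - c) / width) ** 2)
    return np.sum(g * (x - c)) / width ** 2

def metadynamics(steps=20000, dt=5e-3, kT=0.2, stride=200,
                 hill_h=0.05, hill_w=0.2, seed=1):
    """Overdamped Langevin walker in U(x) = (x^2 - 1)^2 with periodic hill deposits."""
    rng = np.random.default_rng(seed)
    x = -1.0                                  # start in the left well
    centers, heights, traj = [], [], []
    noise_amp = np.sqrt(2.0 * kT * dt)        # fluctuation-dissipation amplitude
    for step in range(steps):
        f = -4.0 * x * (x * x - 1.0) + bias_force(x, centers, heights, hill_w)
        x += dt * f + noise_amp * rng.standard_normal()
        traj.append(x)
        if (step + 1) % stride == 0:          # deposit a hill at the current position
            centers.append(x)
            heights.append(hill_h)
    return np.asarray(traj), centers, heights

traj, centers, heights = metadynamics()
```

As hills accumulate in the occupied well, the effective barrier shrinks and the walker explores configurations an unbiased run at the same temperature would rarely reach; snapshots drawn from such a trajectory make a broader training set for an MLIP.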