The Importance of Learning without Constraints: Reevaluating Benchmarks for Invariant and Equivariant Features of Machine Learning Potentials in Generating Free Energy Landscapes
Gustavo R. Pérez-Lemus, Yinan Xu, Yezhi Jin, Pablo F. Zubieta Rico, Juan J. de Pablo
arXiv - PHYS - Computational Physics, 2024-08-28. DOI: arxiv-2408.16157 (https://doi.org/arxiv-2408.16157)
Citations: 0
Abstract
Machine-learned interatomic potentials (MLIPs) are rapidly gaining interest for molecular modeling, as they balance quantum-mechanical-level descriptions of atomic interactions with reasonable computational efficiency. However, questions remain regarding the stability of simulations using these potentials, as well as the extent to which the learned potential energy function can be extrapolated safely. Past studies have reported challenges encountered when MLIPs are applied to classical benchmark systems. In this work, we show that some of these challenges are related to the characteristics of the training datasets, particularly the inclusion of rigid constraints. We demonstrate that long-term stability in simulations with MLIPs can be achieved by generating unconstrained datasets from unbiased classical simulations, provided the fast modes are correctly sampled. Additionally, we emphasize that achieving precise energy predictions requires enhanced sampling techniques for dataset generation, and we demonstrate that safe extrapolation of MLIPs depends on judicious choices related to the system's underlying free energy landscape and the symmetry features embedded within the machine learning models.
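The constraint issue described in the abstract can be illustrated with a deliberately minimal toy model (this is an illustration, not the paper's actual workflow or systems): if a stiff fast mode, such as a bond, is held rigid during dataset generation, the resulting data contain no information about that mode's stiffness, so a potential fitted to them exerts no restoring force along it and simulations with the fitted model can drift or blow up along that coordinate. The sketch below assumes a two-mode harmonic potential and a simple quadratic least-squares fit; all force constants and names are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy force constants: a stiff "fast" bond mode and a soft "slow" angle mode.
k_bond, k_angle, r0, th0 = 400.0, 50.0, 1.0, 1.9

def energy(r, th):
    # Reference potential the "dataset" is labeled with.
    return 0.5 * k_bond * (r - r0) ** 2 + 0.5 * k_angle * (th - th0) ** 2

def fitted_bond_stiffness(r, th):
    # Least-squares fit of E ~ c0 + c1*(r-r0)^2 + c2*(th-th0)^2;
    # the recovered bond force constant is 2*c1.
    X = np.stack([np.ones_like(r), (r - r0) ** 2, (th - th0) ** 2], axis=1)
    c, *_ = np.linalg.lstsq(X, energy(r, th), rcond=None)
    return 2.0 * c[1]

n = 2000
th = th0 + 0.1 * rng.standard_normal(n)  # slow mode is sampled in both datasets

# Unconstrained dataset: the fast bond mode actually fluctuates.
r_free = r0 + 0.02 * rng.standard_normal(n)
kb_free = fitted_bond_stiffness(r_free, th)

# Constrained dataset: bond rigidly fixed, as with SHAKE/RATTLE-style constraints.
r_rigid = np.full(n, r0)
kb_rigid = fitted_bond_stiffness(r_rigid, th)

print(kb_free, kb_rigid)  # ~400 vs ~0: the constrained fit has no bond restoring force
```

With the rigid dataset, the feature column for the bond term is identically zero, so the fit assigns it a (minimum-norm) coefficient of zero: the learned potential is flat along the bond, which is the toy analogue of the instabilities the abstract attributes to constrained training data.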