chemtrain: Learning Deep Potential Models via Automatic Differentiation and Statistical Physics
Paul Fuchs, Stephan Thaler, Sebastien Röcken, Julija Zavadlav
arXiv:2408.15852 [arXiv - PHYS - Computational Physics], 2024-08-28
Abstract
Neural networks (NNs) are promising models for refining the accuracy of molecular dynamics, potentially opening up new fields of application. Typically trained bottom-up, atomistic NN potential models can reach first-principles accuracy, while coarse-grained implicit-solvent NN potentials surpass classical continuum solvent models. However, overcoming the costly generation of accurate reference data and the data inefficiency of common bottom-up training demands the efficient incorporation of data from many sources. This paper introduces the framework chemtrain, which learns sophisticated NN potential models through customizable training routines and advanced training algorithms. These routines can combine multiple top-down and bottom-up algorithms, e.g., to incorporate both experimental and simulation data or to pre-train potentials with less costly algorithms. chemtrain provides an object-oriented high-level interface that simplifies the creation of custom routines. At the lower level, chemtrain relies on JAX to compute gradients and to scale computations to the available resources. We demonstrate the simplicity and importance of combining multiple algorithms by parametrizing an all-atom model of titanium and a coarse-grained implicit-solvent model of alanine dipeptide.
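The abstract notes that chemtrain builds on JAX for gradient computation in bottom-up training. As a rough illustration of that idea only, the minimal sketch below implements force matching in plain JAX: autodiff yields the forces of a toy parametric potential, and autodiff again yields the loss gradient with respect to the potential parameters. All names here (pair_energy, force_matching_loss, the params dict) are hypothetical for this sketch and do not reflect chemtrain's actual API.

```python
import jax
import jax.numpy as jnp

def pair_energy(params, positions):
    """Toy pairwise potential; a real model would substitute an NN here."""
    n = positions.shape[0]
    diff = positions[:, None, :] - positions[None, :, :]
    # Add the identity to the squared distances so the diagonal stays
    # differentiable under sqrt; it is masked out of the sum below anyway.
    r = jnp.sqrt(jnp.sum(diff**2, axis=-1) + jnp.eye(n))
    pair_terms = params["A"] * jnp.exp(-r / params["b"])
    mask = 1.0 - jnp.eye(n)
    return 0.5 * jnp.sum(pair_terms * mask)

# Forces are the negative gradient of the energy w.r.t. atomic positions.
energy_grad = jax.grad(pair_energy, argnums=1)

def force_matching_loss(params, positions, ref_forces):
    """Mean-squared deviation between predicted and reference forces."""
    predicted_forces = -energy_grad(params, positions)
    return jnp.mean((predicted_forces - ref_forces) ** 2)

# Autodiff again, now for the loss gradient w.r.t. the potential parameters.
loss_and_grad = jax.jit(jax.value_and_grad(force_matching_loss))

params = {"A": jnp.array(1.0), "b": jnp.array(0.5)}
positions = jax.random.normal(jax.random.PRNGKey(0), (8, 3))
ref_forces = jnp.zeros((8, 3))  # placeholder reference data
loss, grads = loss_and_grad(params, positions, ref_forces)
```

In a bottom-up routine of this kind, ref_forces would come from first-principles calculations; a top-down algorithm would instead compare ensemble averages from simulation against experimental observables, which is the kind of combination the paper's training routines are designed to express.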