Sampling thermodynamic ensembles of molecular systems with generative neural networks: Will integrating physics-based models close the generalization gap?

IF 13.4 | Zone 2, Materials Science | Q1 MATERIALS SCIENCE, MULTIDISCIPLINARY

Current Opinion in Solid State & Materials Science | Pub Date: 2024-04-06 | DOI: 10.1016/j.cossms.2024.101158

Grant M. Rotskoff

Volume 30, Article 101158. Source: https://www.sciencedirect.com/science/article/pii/S135902862400024X

Citations: 0

Abstract

If the promise of generative modeling techniques is realized, it may fundamentally change how we carry out molecular simulation. The suite of techniques and models collectively termed “generative AI” includes many different classes of models built for varied types of data, from natural language to images. Recent advances in the machine learning literature that construct ever better generative models, though, do not contend with the challenges unique to complex, molecular systems. To generate a statistically likely molecular configuration, many correlated degrees of freedom must be sampled together, while also satisfying the strong constraints of chemical physics. Recent efforts to develop generative models for biomolecular systems have shown spectacular results in some cases—nevertheless, some simple systems remain out of reach with our present methodology. Arguably, the central concern is data efficiency: we should aim to train models that can meaningfully generalize beyond their training data and hence facilitate discovery. In this review, we discuss methods and future directions for directly incorporating physics-based models into generative neural networks, which we believe is a crucial step for addressing the limitations of the current toolkit.


Source journal

Current Opinion in Solid State & Materials Science (Engineering & Technology: Materials Science, Multidisciplinary)

CiteScore

21.10

Self-citation rate

3.60%

Publication count

Review time

47 days

About the journal: Current Opinion in Solid State & Materials Science aims to provide a snapshot of the latest research and advances in materials science. It publishes six issues per year, each containing reviews covering exciting and developing areas of materials science. Each issue comprises 2-3 sections of reviews commissioned by international researchers who are experts in their fields. The journal gives materials scientists the opportunity to stay informed about current developments in their own and related areas of research, and it promotes cross-fertilization of ideas across an increasingly interdisciplinary field.