Elucidating Protein Dynamics through the Optimal Annealing of Variational Autoencoders.

Impact factor 5.7; Zone 1 (Chemistry); JCR Q2 (Chemistry, Physical)

Journal of Chemical Theory and Computation, Pub Date: 2025-06-16, DOI: 10.1021/acs.jctc.5c00365

Subinoy Adhikari, Jagannath Mondal


Citations: 0

Abstract

Proteins traverse intricate conformational landscapes whose transitions and long-lived states hold the key to their biological function, yet unraveling these dynamics remains a formidable challenge. An emerging approach is to train deep variational autoencoders (VAEs) on the conformational ensemble to learn the underlying reduced-dimensional representation. However, VAEs are typically trained with a fixed β value of 1, where β is the weighting factor between the reconstruction and regularization terms. This static setup can often lead to posterior collapse, which significantly hinders the model's ability to capture complex protein dynamics accurately. Annealing the β parameter offers a potential alternative, but it frequently falls short of fully addressing the problem, mainly because the upper bound of β and the annealing schedule are chosen arbitrarily. In this work, we propose a new approach for selecting the β parameter that uses the fraction of variation explained (FVE) score to identify its optimal value. We demonstrate that annealed VAEs trained at their optimum β in a single cycle consistently outperform their nonannealed counterparts, as evident from their higher variational approach for Markov processes-2 (VAMP-2) and generalized matrix Rayleigh quotient (GMRQ) scores and their distinct free energy surface minima for both folded and intrinsically disordered proteins. The improved latent space representations significantly improve state space discretization, thereby refining Markov state models and providing more accurate insights into conformational landscapes, as reflected in distinct contact maps. Together, this development provides a systematic approach to optimizing the balance between the reconstruction and regularization aspects of VAEs, which would augment the potential of annealed VAEs in resolving complex conformational landscapes.
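The quantities named in the abstract can be illustrated with a minimal sketch; this is not the authors' implementation. It assumes a β-weighted objective of the form reconstruction error plus β times the KL divergence to a standard normal prior, a linear single-cycle ramp of β from 0 to an upper bound, and an FVE score defined as one minus the ratio of residual to total sum of squares. The function names, the ramp shape, and this particular FVE formula are assumptions made for illustration; the paper may define them differently.

```python
import numpy as np


def linear_beta(step: int, total_steps: int, beta_max: float) -> float:
    """Single-cycle linear ramp of beta from 0 to beta_max (assumed schedule)."""
    if total_steps <= 1:
        return beta_max
    frac = min(max(step / (total_steps - 1), 0.0), 1.0)
    return beta_max * frac


def gaussian_kl(mu: np.ndarray, log_var: np.ndarray) -> float:
    """Batch mean of KL(N(mu, diag(exp(log_var))) || N(0, I))."""
    kl_per_sample = 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=1)
    return float(np.mean(kl_per_sample))


def beta_vae_loss(x, x_hat, mu, log_var, beta: float) -> float:
    """Squared-error reconstruction term plus beta-weighted KL regularizer."""
    recon = float(np.mean(np.sum((x - x_hat) ** 2, axis=1)))
    return recon + beta * gaussian_kl(mu, log_var)


def fve(x: np.ndarray, x_hat: np.ndarray) -> float:
    """Fraction of variation explained: 1 - residual SS / total SS (assumed form)."""
    residual = np.sum((x - x_hat) ** 2)
    total = np.sum((x - x.mean(axis=0)) ** 2)
    return float(1.0 - residual / total)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(100, 8))                    # stand-in for input features
    x_hat = x + 0.1 * rng.normal(size=x.shape)       # stand-in for a reconstruction
    mu = 0.1 * rng.normal(size=(100, 2))             # stand-in encoder means
    log_var = np.zeros((100, 2))                     # stand-in encoder log-variances
    for step in (0, 50, 99):
        beta = linear_beta(step, 100, beta_max=2.0)  # beta_max is a hypothetical value
        print(step, beta, beta_vae_loss(x, x_hat, mu, log_var, beta))
    print("FVE:", fve(x, x_hat))
```

In a workflow of this kind, one would train a model at each candidate upper bound of β and compare the resulting scores; the precise selection criterion is the paper's contribution and is not reproduced here.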


Source journal

Journal of Chemical Theory and Computation (Chemistry; Physics: Atomic, Molecular and Chemical)

CiteScore: 9.90

Self-citation rate: 16.40%

Articles published: 568

Review time: 1 month

About the journal: The Journal of Chemical Theory and Computation invites new and original contributions with the understanding that, if accepted, they will not be published elsewhere. Papers reporting new theories, methodology, and/or important applications in quantum electronic structure, molecular dynamics, and statistical mechanics are appropriate for submission to this Journal. Specific topics include advances in or applications of ab initio quantum mechanics, density functional theory, design and properties of new materials, surface science, Monte Carlo simulations, solvation models, QM/MM calculations, biomolecular structure prediction, and molecular dynamics in the broadest sense including gas-phase dynamics, ab initio dynamics, biomolecular dynamics, and protein folding. The Journal does not consider papers that are straightforward applications of known methods including DFT and molecular dynamics. The Journal favors submissions that include advances in theory or methodology with applications to compelling problems.