Computation of Protein Thermostability and Epistasis

IF 27 | Zone 2, Chemistry | JCR Q1: CHEMISTRY, MULTIDISCIPLINARY

Wiley Interdisciplinary Reviews: Computational Molecular Science, Volume 15, Issue 5. Pub Date: 2025-09-18. DOI: 10.1002/wcms.70045

Francesca Peccati, Cristina M. Segovia, Reyes Núñez-Franco, Gonzalo Jiménez-Osés

Citations: 0

Abstract

The ability to computationally predict changes in protein thermostability upon mutation is crucial for advancing protein design and engineering, with applications ranging from therapeutics to biocatalysis. This review provides a comprehensive overview of the significant challenges and diverse computational strategies for predicting protein stability and understanding epistatic interactions across protein variants. A primary obstacle to this goal is the scarcity of high-quality, large-scale thermodynamic datasets, which are often biased toward single-point, destabilizing mutations and lack standardized experimental metrics. This limitation directly impacts the performance and generalizability of data-driven methods, from early machine learning approaches to modern deep learning architectures such as ThermoMPNN and protein language models. Physics-based approaches, such as those employing Rosetta and FoldX energy functions, offer valuable insights but are often limited by their reliance on static structures and oversimplified representations of the unfolded state. While molecular dynamics simulations can capture the critical role of protein flexibility and dynamics in thermostabilization, their computational cost restricts their application in high-throughput screening. Accurately predicting the effects of multiple mutations is further complicated by epistasis, where nonadditive interactions can significantly alter stability and function. Overcoming these hurdles requires a synergistic approach, integrating AI-driven predictions with physics-based simulations and accurate conformational sampling methods. Promising future directions include the development of more comprehensive and unbiased datasets, and improved modeling of epistasis and the (un)folded states and their ensembles. Such advancements are essential for enhancing the reliability of thermostability predictions and navigating the complex stability–activity trade-offs inherent in protein optimization and design.
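
The nonadditivity mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common definition of pairwise epistasis as the deviation of a double mutant's stability change from the sum of its single-mutant changes, ε = ΔΔG(A,B) − (ΔΔG(A) + ΔΔG(B)); the review itself may use a different formulation or sign convention. The function name `pairwise_epistasis`, the mutation labels, and all numerical values below are hypothetical and serve only as an illustration.

```python
def pairwise_epistasis(ddg_single, ddg_double):
    """Deviation from additivity for each measured double mutant.

    ddg_single: dict mapping a mutation label to its stability change (kcal/mol).
    ddg_double: dict mapping a (mutation, mutation) pair to the stability
                change of the double mutant (kcal/mol).
    Returns a dict mapping each pair to
        epsilon = ddG(A,B) - (ddG(A) + ddG(B)).
    An epsilon near zero indicates additive behavior; a nonzero value
    indicates epistasis under this definition.
    """
    epsilon = {}
    for (mut_a, mut_b), ddg_ab in ddg_double.items():
        additive_expectation = ddg_single[mut_a] + ddg_single[mut_b]
        epsilon[(mut_a, mut_b)] = ddg_ab - additive_expectation
    return epsilon


# Made-up values, for illustration only (not taken from any dataset).
ddg_single = {"A10V": 0.5, "G25P": 0.8, "K40E": -0.3}
ddg_double = {
    ("A10V", "G25P"): 1.3,  # matches the additive expectation
    ("A10V", "K40E"): 1.0,  # deviates from the additive expectation
}

epistasis = pairwise_epistasis(ddg_single, ddg_double)
for pair, eps in epistasis.items():
    print(pair, round(eps, 2))
```

In practice the experimental or computational uncertainty of each ΔΔG value would have to be propagated before calling a deviation significant; that step is left out of this sketch.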

This article is categorized under:

Source journal

Wiley Interdisciplinary Reviews: Computational Molecular Science (CHEMISTRY, MULTIDISCIPLINARY; MATHEMATICAL & COMPUTATIONAL BIOLOGY)

CiteScore: 28.90

Self-citation rate: 1.80%

Articles published:

Review time: 6-12 weeks

Journal description: Computational molecular sciences harness the power of rigorous chemical and physical theories, employing computer-based modeling, specialized hardware, software development, algorithm design, and database management to explore and illuminate every facet of molecular sciences. These interdisciplinary approaches form a bridge between chemistry, biology, and materials sciences, establishing connections with adjacent application-driven fields in both chemistry and biology. WIREs Computational Molecular Science stands as a platform to comprehensively review and spotlight research from these dynamic and interconnected fields.