Keila Voortman-Sheetz, James O Wrabl, Vincent J Hilser
Impact of local unfolding fluctuations on the evolution of regional sequence preferences in proteins.
Protein Science, volume 34, issue 3, e70015. Published 2025-03-01. DOI: https://doi.org/10.1002/pro.70015. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11837041/pdf/
Abstract
The number of distinct structural environments in the proteome (as observed in the Protein Data Bank) may belie an organizing framework, whereby evolution conserves the relative stability of different sequence segments, regardless of the specific structural details present in the final fold. If true, the question arises as to whether the energetic consequences of amino acid substitutions, and thus the frequencies of amino acids within each of these so-called thermodynamic environments, could depend less on what local structure that sequence segment may adopt in the final fold, and more on the local stability of that final structure relative to the unfolded state. To address this question, a previously described ensemble-based approach (the COREX algorithm) was used to define proteins in terms of thermodynamic environments, and the naturally occurring frequencies of amino acids within these environments were used to generate statistical energies (a type of knowledge-based potential). By comparing compatibility scores from the statistical energies with energies calculated using the Rosetta all-atom energy function, we assessed the information overlap between the two approaches. Results revealed a substantial correlation between the statistical scores and those obtained using Rosetta, directly demonstrating that a small number of thermodynamic environments are sufficient to capture the perceived multiplicity of different structural environments in proteins. More importantly, the agreement suggests that regional amino acid distributions within each protein in any proteome have been substantially driven by the evolutionary conservation of the regional differences in stabilities within protein families.
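As a rough illustration of how amino acid frequencies within environments can be turned into statistical energies, the sketch below uses the generic log-odds form of a knowledge-based potential. It is not the authors' implementation: the environment labels ("high", "low"), the pseudocount, the toy counts, and the function names are all assumptions made for the example, and real environment assignments would come from an ensemble calculation such as COREX.

```python
import math
from collections import Counter


def statistical_energies(observations, pseudocount=1.0):
    """Log-odds energies (in units of kT) from (environment, amino_acid) pairs.

    E(env, aa) = -ln( P(aa | env) / P(aa) ), with additive smoothing.
    Hypothetical helper for illustration only.
    """
    pair_counts = Counter(observations)
    env_counts = Counter()
    aa_counts = Counter()
    for (env, aa), n in pair_counts.items():
        env_counts[env] += n
        aa_counts[aa] += n

    total = sum(aa_counts.values())
    n_aa = len(aa_counts)

    energies = {}
    for env in env_counts:
        for aa in aa_counts:
            # Smoothed frequency of this residue type inside the environment
            p_aa_given_env = (pair_counts[(env, aa)] + pseudocount) / (
                env_counts[env] + pseudocount * n_aa
            )
            # Smoothed background frequency over all environments
            p_aa = (aa_counts[aa] + pseudocount) / (total + pseudocount * n_aa)
            energies[(env, aa)] = -math.log(p_aa_given_env / p_aa)
    return energies


def compatibility_score(sequence, environments, energies):
    """Sum the per-position energies; lower means more compatible."""
    return sum(energies[(env, aa)] for aa, env in zip(sequence, environments))


# Toy counts (made up): Leu enriched in the "high" stability environment,
# Gly enriched in the "low" stability environment.
observations = (
    [("high", "L")] * 8 + [("high", "G")] * 2
    + [("low", "L")] * 2 + [("low", "G")] * 8
)
energies = statistical_energies(observations)
favorable = compatibility_score("LG", ["high", "low"], energies)
unfavorable = compatibility_score("GL", ["high", "low"], energies)
```

In this toy case the sequence whose residues match their environments' preferences ("LG") scores lower than the mismatched one ("GL"). Per-residue or per-protein scores of this kind are what would then be compared against energies from an all-atom function such as Rosetta.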
About the journal:
Protein Science, the flagship journal of The Protein Society, is a publication that focuses on advancing fundamental knowledge in the field of protein molecules. The journal welcomes original reports and review articles that contribute to our understanding of protein function, structure, folding, design, and evolution.
Additionally, Protein Science encourages papers that explore the applications of protein science in various areas such as therapeutics, protein-based biomaterials, bionanotechnology, synthetic biology, and bioelectronics.
The journal accepts manuscript submissions in any suitable format for review; conversion to journal-style format is required only upon acceptance for publication.
Protein Science is indexed and abstracted in numerous databases, including the Agricultural & Environmental Science Database (ProQuest), Biological Science Database (ProQuest), CAS: Chemical Abstracts Service (ACS), Embase (Elsevier), Health & Medical Collection (ProQuest), Health Research Premium Collection (ProQuest), Materials Science & Engineering Database (ProQuest), MEDLINE/PubMed (NLM), Natural Science Collection (ProQuest), and SciTech Premium Collection (ProQuest).