Saverio Rossi, Leonardo Di Bari, Martin Weigt, Francesco Zamponi
Title: Fluctuations and the limit of predictability in protein evolution
Journal: Reports on progress in physics. Physical Society (Great Britain)
Publication date: 2025-07-09
DOI: 10.1088/1361-6633/adea92 (https://doi.org/10.1088/1361-6633/adea92)
Citation count: 0
Abstract
Protein evolution involves mutations occurring across a wide range of time scales. In analogy with disordered systems in statistical physics, this dynamical heterogeneity suggests strong correlations between mutations happening at distinct sites and times. To quantify these correlations, we examine the role of various fluctuation sources in protein evolution, simulated using a data-driven energy landscape as a proxy for protein fitness. By applying spatio-temporal correlation functions developed in the context of disordered physical systems, we disentangle fluctuations originating from the initial condition, i.e. the ancestral sequence from which the evolutionary process originated, from those driven by stochastic mutations along independent evolutionary paths. Our analysis shows that, in diverse protein families, fluctuations from the ancestral sequence predominate at shorter time scales. This allows us to identify a time scale over which ancestral sequence information persists, enabling its reconstruction. We link this persistence to the strength of epistatic interactions: ancestral sequences with stronger epistatic signatures impact evolutionary trajectories over extended periods. At longer time scales, however, ancestral influence fades as epistatically constrained sites evolve collectively. To confirm this idea, we apply a standard ancestral sequence reconstruction (ASR) algorithm and verify that the time-dependent recovery error is influenced by the properties of the ancestor itself. Overall, our results reveal that the properties of ancestral sequences, particularly their epistatic constraints, influence the initial evolutionary dynamics and the performance of standard ASR algorithms.
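As a rough illustration of what separating ancestral from stochastic fluctuations can mean in practice, the Python sketch below evolves several independent paths from each of a few ancestors on a toy Potts-like energy landscape and splits the variance of the overlap with the ancestor into a between-ancestor part and a within-ancestor part. Everything here is an assumption made for illustration: the function names (`make_landscape`, `delta_energy`, `evolve`, `overlap`, `decompose`), the random Gaussian fields and couplings, the unit-temperature Metropolis dynamics, and the simple law-of-total-variance split. It is not the data-driven landscape or the spatio-temporal correlation functions used in the paper.

```python
import math
import random


def make_landscape(L, q, coupling_scale, rng):
    """Toy landscape (assumption): Gaussian fields h[i][a] and symmetric couplings J[i][j][a][b]."""
    h = [[rng.gauss(0.0, 1.0) for _ in range(q)] for _ in range(L)]
    J = [[[[0.0] * q for _ in range(q)] for _ in range(L)] for _ in range(L)]
    for i in range(L):
        for j in range(i + 1, L):
            for a in range(q):
                for b in range(q):
                    v = rng.gauss(0.0, coupling_scale)
                    J[i][j][a][b] = v
                    J[j][i][b][a] = v  # keep J symmetric under (i,a) <-> (j,b)
    return h, J


def delta_energy(seq, i, new, h, J):
    """Energy change for mutating site i to `new`, with E = -sum_i h - sum_{i<j} J."""
    old = seq[i]
    dE = -(h[i][new] - h[i][old])
    for j, b in enumerate(seq):
        if j != i:
            dE -= J[i][j][new][b] - J[i][j][old][b]
    return dE


def evolve(ancestor, steps, h, J, rng, q):
    """Single-site Metropolis mutations at unit temperature; returns the final sequence."""
    seq = list(ancestor)
    for _ in range(steps):
        i = rng.randrange(len(seq))
        new = rng.randrange(q)
        if new == seq[i]:
            continue
        dE = delta_energy(seq, i, new, h, J)
        if dE <= 0 or rng.random() < math.exp(-dE):
            seq[i] = new
    return seq


def overlap(a, b):
    """Fraction of sites at which two sequences carry the same symbol."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def decompose(ancestors, n_paths, steps, h, J, rng, q):
    """Return (between-ancestor variance, mean within-ancestor variance) of the overlap."""
    means, variances = [], []
    for anc in ancestors:
        qs = [overlap(anc, evolve(anc, steps, h, J, rng, q)) for _ in range(n_paths)]
        m = sum(qs) / len(qs)
        means.append(m)
        variances.append(sum((x - m) ** 2 for x in qs) / len(qs))
    grand = sum(means) / len(means)
    between = sum((m - grand) ** 2 for m in means) / len(means)
    within = sum(variances) / len(variances)
    return between, within


if __name__ == "__main__":
    rng = random.Random(0)
    L, q = 10, 4  # toy sizes, far smaller than a real protein family
    h, J = make_landscape(L, q, 0.5, rng)
    # Ancestors: random sequences relaxed on the landscape for a while.
    ancestors = [
        evolve([rng.randrange(q) for _ in range(L)], 500, h, J, rng, q)
        for _ in range(5)
    ]
    for steps in (5, 20, 80, 200):
        between, within = decompose(ancestors, 10, steps, h, J, rng, q)
        print(f"steps={steps:4d}  between-ancestor={between:.4f}  within-ancestor={within:.4f}")
```

In this toy setting, the between-ancestor term plays the role of fluctuations inherited from the initial condition, and the within-ancestor term that of fluctuations generated along independent paths. How the two compare as `steps` grows depends on the assumed landscape and sample sizes, so the printed numbers should not be read as reproducing the paper's results.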