{"title":"A General Substitution Matrix for Structural Phylogenetics.","authors":"Sriram G Garg, Georg K A Hochberg","doi":"10.1093/molbev/msaf124","DOIUrl":null,"url":null,"abstract":"<p><p>Sequence-based maximum likelihood phylogenetics is a widely used method for inferring evolutionary relationships, which has illuminated the evolutionary histories of proteins and the organisms that harbor them. However, modern implementations with sophisticated models of sequence evolution struggle to resolve deep evolutionary relationships, which can be obscured by excessive sequence divergence and substitution saturation. Structural phylogenetics has emerged as a promising alternative because protein structure evolves much more slowly than protein sequences. Recent developments in protein structure prediction using AI have made it possible to predict protein structures for entire protein families and then to translate these structures into a sequence representation-the 3Di structural alphabet-that can in theory be directly fed into existing sequence-based phylogenetic software. To unlock the full potential of this idea, however, requires the inference of a general substitution matrix for structural phylogenetics, which has so far been missing. Here, we infer this matrix from large datasets of protein structures and show that it results in a better fit to empirical datasets than previous approaches. We then use this matrix to re-visit the question of the root of the tree of life. Using structural phylogenies of universal paralogs, we provide the first unambiguous evidence for a root between archaea and bacteria. Finally, we discuss some practical and conceptual limitations of structural phylogenetics. Our 3Di substitution matrix provides a starting point for revisiting many deep phylogenetic problems that have so far been extremely difficult to solve.</p>","PeriodicalId":18730,"journal":{"name":"Molecular biology and evolution","volume":" ","pages":""},"PeriodicalIF":11.0000,"publicationDate":"2025-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12198762/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Molecular biology and evolution","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1093/molbev/msaf124","RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"BIOCHEMISTRY & MOLECULAR BIOLOGY","Score":null,"Total":0}
引用次数: 0
Abstract
Sequence-based maximum likelihood phylogenetics is a widely used method for inferring evolutionary relationships, which has illuminated the evolutionary histories of proteins and the organisms that harbor them. However, modern implementations with sophisticated models of sequence evolution struggle to resolve deep evolutionary relationships, which can be obscured by excessive sequence divergence and substitution saturation. Structural phylogenetics has emerged as a promising alternative because protein structure evolves much more slowly than protein sequences. Recent developments in protein structure prediction using AI have made it possible to predict protein structures for entire protein families and then to translate these structures into a sequence representation-the 3Di structural alphabet-that can in theory be directly fed into existing sequence-based phylogenetic software. To unlock the full potential of this idea, however, requires the inference of a general substitution matrix for structural phylogenetics, which has so far been missing. Here, we infer this matrix from large datasets of protein structures and show that it results in a better fit to empirical datasets than previous approaches. We then use this matrix to re-visit the question of the root of the tree of life. Using structural phylogenies of universal paralogs, we provide the first unambiguous evidence for a root between archaea and bacteria. Finally, we discuss some practical and conceptual limitations of structural phylogenetics. Our 3Di substitution matrix provides a starting point for revisiting many deep phylogenetic problems that have so far been extremely difficult to solve.
期刊介绍:
Molecular Biology and Evolution
Journal Overview:
Publishes research at the interface of molecular (including genomics) and evolutionary biology
Considers manuscripts containing patterns, processes, and predictions at all levels of organization: population, taxonomic, functional, and phenotypic
Interested in fundamental discoveries, new and improved methods, resources, technologies, and theories advancing evolutionary research
Publishes balanced reviews of recent developments in genome evolution and forward-looking perspectives suggesting future directions in molecular evolution applications.