Léa Bou Dagher, Dominique Madern, Philippe Malbos, Céline Brochier-Armanet
Title: Faithful Interpretation of Protein Structures through Weighted Persistent Homology Improves Evolutionary Distance Estimation
Journal: Molecular Biology and Evolution (Journal Article)
Publication date: 2025-02-03
DOI: 10.1093/molbev/msae271 (https://doi.org/10.1093/molbev/msae271)
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11789942/pdf/
Citations: 0
Abstract
Phylogenetic inference is mainly based on sequence analysis and requires reliable alignments. This can be challenging, especially when sequences are highly divergent. In this context, the use of three-dimensional protein structures is a promising alternative. In a recent study, we introduced an original topological data analysis method based on persistent homology to estimate the evolutionary distances from structures. The method was successfully tested on 518 protein families representing 22,940 predicted structures. However, as anticipated, the reliability of the estimated evolutionary distances was impacted by the quality of the predicted structures and the presence of indels in the proteins. This paper introduces a new topological descriptor, called bio-topological marker (BTM), which provides a more faithful description of the structures, a topological analysis for estimating evolutionary distances from BTMs, and a new weight-filtering method adapted to protein structures. These new developments significantly improve the estimation of evolutionary distances and phylogenies inferred from structures.
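For readers unfamiliar with persistent homology, the following is a minimal generic sketch of its simplest case, not the authors' BTM method or their weight-filtering scheme. It assumes a structure is reduced to a 3D point cloud (for example, one point per residue, which is an assumption made here for illustration) and computes when connected components merge as a distance threshold grows. The function name `h0_persistence` is hypothetical.

```python
import math
from itertools import combinations


def h0_persistence(points):
    """Death times of 0-dimensional classes in a Vietoris-Rips filtration.

    Every point is born at threshold 0. A component "dies" when the growing
    distance threshold first connects it to another component. One component
    never dies and is omitted from the result.
    """
    n = len(points)
    parent = list(range(n))  # union-find forest over point indices

    def find(i):
        # Find the component representative, with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by Euclidean length (Kruskal's order).
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    deaths = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            # Two components merge at this threshold: one class dies.
            parent[ri] = rj
            deaths.append(length)
    return deaths


# Illustrative coordinates only, not real protein data.
toy_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
print(h0_persistence(toy_points))
```

The list of death times is one simple form of topological summary. Comparing such summaries between two structures is the general idea behind deriving a distance from topology; the descriptor and comparison actually used in the paper are richer than this sketch.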
About the journal:
Molecular Biology and Evolution
Journal Overview:
Publishes research at the interface of molecular (including genomics) and evolutionary biology
Considers manuscripts containing patterns, processes, and predictions at all levels of organization: population, taxonomic, functional, and phenotypic
Interested in fundamental discoveries, new and improved methods, resources, technologies, and theories advancing evolutionary research
Publishes balanced reviews of recent developments in genome evolution and forward-looking perspectives suggesting future directions in molecular evolution applications.