Detecting Introgression in Shallow Phylogenies: How Minor Molecular Clock Deviations Lead to Major Inference Errors
Authors: Xiao-Xu Pang, Jianquan Liu, Da-Yong Zhang
Journal: Molecular Biology and Evolution (Impact Factor 5.3; JCR Q1, Biochemistry & Molecular Biology)
Published: 2025-10-01
DOI: https://doi.org/10.1093/molbev/msaf216
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12485364/pdf/
Citations: 0
Abstract
Recent theoretical and algorithmic advances in introgression detection, coupled with the growing availability of genome-scale data, have highlighted the widespread occurrence of interspecific gene flow across the tree of life. However, current methods largely depend on the molecular clock assumption, a questionable premise given empirical evidence of substitution rate variation across lineages. While such rate heterogeneity is known to compromise gene flow detection among divergent lineages, its impact on closely related taxa at shallow evolutionary timescales remains poorly understood, likely because these taxa are often assumed to adhere to a molecular clock. To address this gap, we combine theoretical analyses and simulations to evaluate the robustness of widely used site pattern methods (D-statistic and HyDe) to rate variation across phylogenetic timescales. Our results demonstrate that both methods are highly sensitive to even minor deviations from the molecular clock at shallow timescales, complementing previous findings at deeper scales. Specifically, in young phylogenies (with an age of 3 × 10^5 generations) with small population sizes, weak (17% difference) and moderate (33% difference) rate variation can inflate false-positive rates up to 35% and 100%, respectively, using site pattern counts from a 500 Mb genome. Employing a more distant outgroup intensifies these spurious signals. Our study demonstrates that summary tests for introgression are pervasively vulnerable to minor rate variations and underscores the critical need for advanced methodologies to disentangle genuine introgression from false signals generated by rate heterogeneity.
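As a rough illustration of the kind of site pattern test the abstract evaluates, the sketch below computes the D-statistic from ABBA and BABA counts using its standard definition, D = (ABBA - BABA) / (ABBA + BABA). The function names, the example counts, and the simplified Z-score that treats sites as independent are assumptions made for illustration and are not taken from the paper; real analyses typically estimate the variance with a block jackknife instead.

```python
import math


def d_statistic(abba: int, baba: int) -> float:
    """Patterson's D from ABBA and BABA site pattern counts."""
    total = abba + baba
    if total == 0:
        raise ValueError("no informative ABBA/BABA sites")
    return (abba - baba) / total


def z_score_independent_sites(abba: int, baba: int) -> float:
    """Simplified Z-score for D = 0.

    Assumption (illustrative only): sites are independent, so under the
    null each informative site is ABBA or BABA with probability 0.5 and
    the difference has variance equal to the total count. Linked sites
    in real genomes violate this assumption.
    """
    total = abba + baba
    if total == 0:
        raise ValueError("no informative ABBA/BABA sites")
    return (abba - baba) / math.sqrt(total)


# Hypothetical counts: the same 2% asymmetry at two data-set sizes.
small = (5_100, 4_900)
large = (510_000, 490_000)

for abba, baba in (small, large):
    d = d_statistic(abba, baba)
    z = z_score_independent_sites(abba, baba)
    print(f"ABBA={abba} BABA={baba} D={d:.3f} Z={z:.1f}")
```

In this hypothetical example D stays at 0.02 while the simplified Z-score rises from 2 to 20 as the counts grow a hundredfold. This is in line with the abstract's point that a small systematic asymmetry, such as one caused by rate variation rather than gene flow, can become highly significant with genome-scale site pattern counts.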
About the journal:
Molecular Biology and Evolution
Journal Overview:
Publishes research at the interface of molecular (including genomics) and evolutionary biology
Considers manuscripts containing patterns, processes, and predictions at all levels of organization: population, taxonomic, functional, and phenotypic
Interested in fundamental discoveries, new and improved methods, resources, technologies, and theories advancing evolutionary research
Publishes balanced reviews of recent developments in genome evolution and forward-looking perspectives suggesting future directions in molecular evolution applications.