Inference of Cross-Species Gene Flow Using Genomic Data Depends on the Methods: Case Study of Gene Flow in Drosophila
Jiayi Ji, Thomas Roberts, Tomáš Flouri, Ziheng Yang
Systematic Biology (JCR Q1, Evolutionary Biology; impact factor 5.7). Published 2025-05-27.
DOI: 10.1093/sysbio/syaf019 (https://doi.org/10.1093/sysbio/syaf019)
Abstract
Analysis of genomic data in the past two decades has highlighted the prevalence of introgression as an important evolutionary force in both plants and animals. The genus Drosophila has received much attention recently, with an analysis of genomic sequence data revealing widespread introgression across the species phylogeny for the genus. However, the methods used in the study are based on data summaries for species triplets and are unable to infer gene flow between sister lineages or to identify the direction of gene flow. Hence, we reanalyze a subset of the data using the Bayesian program bpp, which is a full-likelihood implementation of the multispecies coalescent model and can provide more powerful inference of gene flow between species, including its direction, timing, and strength. While our analysis supports the presence of gene flow in the species group, the results differ from the previous study: we infer gene flow between sister lineages undetected previously whereas most gene-flow events inferred in the previous study are rejected in our tests. To verify our conclusions, we performed simulations to examine the properties of Bayesian and summary methods. Bpp was found to have high power to detect gene flow, high accuracy in estimated rates of gene flow, and robustness under misspecification of the mode of gene flow. In contrast, summary methods had low power and produced biased estimates of introgression probability. Our results highlight an urgent need for improving the statistical properties of summary methods and the computational efficiency of likelihood methods for inferring gene flow using genomic sequence data.
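The limitation of triplet-based summaries noted in the abstract can be illustrated with a minimal sketch. The abstract does not name the specific summary statistics, so the test below is an assumption chosen for illustration: for a species triplet ((A,B),C), it compares the counts of the two discordant gene-tree topologies, which are expected to be equal under the multispecies coalescent without gene flow. The function name `discordant_asymmetry` and its arguments are hypothetical and do not come from bpp or from the study being reanalyzed.

```python
from math import comb


def discordant_asymmetry(n_ac: int, n_bc: int) -> tuple[float, float]:
    """Test for asymmetry between the two discordant topologies of a triplet.

    Illustrative assumption, not the authors' method. For species tree
    ((A,B),C), n_ac counts gene trees ((A,C),B) and n_bc counts ((B,C),A).
    Returns (asymmetry, p_value), where asymmetry lies in [-1, 1] and the
    p-value is a two-sided exact binomial test of equal frequencies.
    """
    n = n_ac + n_bc
    if n == 0:
        raise ValueError("need at least one discordant gene tree")
    # Positive values mean an excess of ((B,C),A) trees, negative an excess
    # of ((A,C),B) trees.
    asymmetry = (n_bc - n_ac) / n
    # Under the null each discordant tree is equally likely to be either
    # topology, so the smaller count is a Binomial(n, 0.5) tail event.
    k = min(n_ac, n_bc)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    p_value = min(1.0, 2.0 * tail)
    return asymmetry, p_value


if __name__ == "__main__":
    print(discordant_asymmetry(10, 30))
```

In this sketch, gene flow between the sister species A and B raises the frequency of the concordant topology but leaves the two discordant counts balanced, so the test has nothing to detect. A significant asymmetry indicates which non-sister pair exchanged genes, but its sign does not reveal the direction of the exchange.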
About the journal:
Systematic Biology is the bimonthly journal of the Society of Systematic Biologists. Papers for the journal are original contributions to the theory, principles, and methods of systematics as well as phylogeny, evolution, morphology, biogeography, paleontology, genetics, and the classification of all living things. A Points of View section offers a forum for discussion, while book reviews and announcements of general interest are also featured.