Revisiting the Multispecies Coalescent Model fit with an example from a complete molecular phylogeny of the Liolaemus wiegmannii species group (Squamata: Liolaemidae).
Joaquín Villamil, Mariana Morando, Luciano J Avila, Flávia M Lanna, Emanuel M Fonseca, Jack W Sites, Arley Camargo
Systematic Biology. Published 2025-07-10. DOI: 10.1093/sysbio/syaf048 (https://doi.org/10.1093/sysbio/syaf048)
Abstract
Departures from the assumptions of the Multispecies Coalescent (MSC) can produce artefactual topologies and node height estimates; trees inferred without testing MSC model fit may therefore misrepresent the evolutionary history of a group. The current implementation of MSC model testing for non-genomic-level molecular markers cannot process trees estimated with BEAST 2, which limits its application to large datasets of sequence-based markers. Here we recode functions of the R package P2C2M to assess model fit to the MSC and apply this new implementation, which we named P2C2M2, to test the MSC model in a 16-locus dataset of 42 lizard species focused on the Liolaemus wiegmannii group. We found strong evidence of model departures in several loci, possibly due to historical gene flow, which could also explain the unexpected position of the L. wiegmannii group within the L. montanus section of Eulaemus when hybridization is not accounted for. When gene flow is incorporated via a Multispecies Network Coalescent model, the L. anomalus group is inferred as the group closest to the L. wiegmannii group, and a reticulation is inferred that suggests historical gene flow between the L. wiegmannii and L. montanus groups, which has not been previously reported. We argue that there are at least three sources of discrepancy between the literature and the node ages estimated in our study: the use of strict molecular clocks without statistical justification, misplaced fossil calibrations, and the estimation of coalescent times instead of species divergence times. We encourage systematists to routinely test the fit of the MSC model when estimating species trees using sequence-based markers, and to follow a phylogenetic network approach when this test is significant and historical gene flow is considered a plausible source of the departure from the MSC model.
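The third source of discrepancy named in the abstract, estimating coalescent times instead of species divergence times, can be illustrated with a small simulation. The sketch below is not the authors' method and is not part of P2C2M2; it only assumes the standard coalescent result that two lineages in a diploid ancestral population of effective size N_e take, on average, 2 N_e generations to coalesce, with no gene flow and one sampled sequence per species. The function name and the values of `tau` and `ne` are hypothetical, chosen for illustration only.

```python
import random
import statistics


def simulate_gene_divergence(tau, ne, n_loci, seed=0):
    """Simulate gene divergence times (in generations) between two species.

    tau    : species divergence time in generations (hypothetical value)
    ne     : diploid effective size of the ancestral population (hypothetical)
    n_loci : number of independent loci to simulate
    """
    rng = random.Random(seed)
    # Going backwards in time, the two lineages cannot coalesce before tau.
    # In the ancestral population they coalesce at rate 1 / (2 * ne) per
    # generation, so the extra waiting time is exponential with mean 2 * ne.
    return [tau + rng.expovariate(1.0 / (2 * ne)) for _ in range(n_loci)]


# Hypothetical parameter values, not taken from the study.
tau, ne = 100_000, 50_000
times = simulate_gene_divergence(tau, ne, n_loci=20_000)
mean_time = statistics.mean(times)

print(f"species divergence time: {tau} generations")
print(f"mean gene divergence time: {mean_time:.0f} generations")
print(f"theoretical expectation (tau + 2*ne): {tau + 2 * ne} generations")
```

Under these assumptions every simulated gene divergence is at least as old as the species split, so a node age read from gene trees (or from concatenated data) would tend to be older than the species divergence time that an MSC-based analysis aims to estimate.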
About the journal:
Systematic Biology is the bimonthly journal of the Society of Systematic Biologists. Papers for the journal are original contributions to the theory, principles, and methods of systematics as well as phylogeny, evolution, morphology, biogeography, paleontology, genetics, and the classification of all living things. A Points of View section offers a forum for discussion, while book reviews and announcements of general interest are also featured.