A nuclear phylogenomic tree of grasses (Poaceae) recovers current classification despite gene tree incongruence

IF 8.3 · CAS Zone 1 (Biology) · JCR Q1 Plant Sciences
New Phytologist · Pub Date: 2024-11-20 · DOI: 10.1111/nph.20263

Abstract

Introduction

With almost 11 800 species in 791 genera (Soreng et al., 2022), grasses (Poaceae) are among the largest plant families and one of the most important for humans. Grasses include the primary food crops rice, maize and wheat, sources of fibre and building materials such as reed and bamboo, and biofuel crops such as sugarcane and switchgrass. Much of the global land surface is covered by grass-dominated ecosystems, where grasses impact productivity, nutrient cycling and vegetation structure by mediating fire and herbivory (Edwards et al., 2010; Bond, 2016). Grasses are also overrepresented among the world's most damaging agricultural weeds (Holm et al., 1977) and invasive plants (Linder et al., 2018). Understanding functional diversification, adaptation and novel crop breeding in this important plant group requires a solid understanding of its evolutionary relationships.

Efforts to uncover the phylogenetic history of grasses have tracked the development of new technology and analytical tools, beginning with cladistic analysis of morphology (e.g. Campbell & Kellogg, 1987). Almost as soon as nucleotide sequencing became possible, it was used to investigate grasses (rRNA sequencing, Hamby & Zimmer, 1988, and chloroplast DNA, Clark et al., 1995), and the results were interpreted in the light of known morphology and classification. Hundreds of papers published since then have used nucleic acids, most recently DNA, to assess grass phylogeny at all taxonomic levels, assembling information from all three genomes in the cell (plastid, mitochondrial, and nuclear). These efforts have been punctuated by two major phylogenetic analyses, Grass Phylogeny Working Group I (GPWG, 2001) and GPWG II (2012); together with many other detailed phylogenetic analyses, these enabled family-wide classifications (Kellogg, 2015; Soreng et al., 2022).

The major outlines of grass phylogeny have now been known for several decades and corroborated by accumulating data, with major lineages recognised as subfamilies (Kellogg, 2015; Soreng et al., 2022). The earliest divergences in the grass family gave rise to three successive lineages, Anomochlooideae, Pharoideae, and Puelioideae, each comprising just a few species. After the divergence of those three, however, the remaining grasses gave rise to two sister lineages, known as BOP and PACMAD, each of which became a species-rich clade with several robust subclades. This sturdy phylogenetic framework is reflected in a strong subfamilial classification, with subfamilies divided into equally robust tribes. Attention in recent years has largely shifted to relationships of tribes, subtribes, and genera.

Reticulate evolution is common in the grasses. Allopolyploidy is widespread in the family, particularly among closely related species and genera, with as many as 80% of species estimated to be of recent polyploid origin (Stebbins, 1985). The textbook example is bread wheat (Triticum aestivum) and its ruderal annual ancestors, the history of which was determined in the first part of the 20th century using cytogenetic tools (Kihara, 1982; Tsunewaki, 2018). Nucleotide sequence data have verified the hybrid origin of wheat and gone on to show that reticulate evolution is the norm in the entire tribe Triticeae (Feldman & Levy, 2023; Mason-Gamer & White, 2024). We have also learned that three of the four major clades of Bambusoideae are of allopolyploid origin (Triplett et al., 2014; Guo et al., 2019; Chalopin et al., 2021; Ma et al., 2024), as are at least one third of the species in Andropogoneae (Estep et al., 2014). Large-scale lateral gene transfer has also been demonstrated in Alloteropsis semialata (Dunning et al., 2019) and for a number of genomes across the family (Hibdige et al., 2021), although it remains unclear how common such genetic exchanges are. Network-like reticulations are therefore expected throughout Poaceae.

Data relevant to grass phylogeny continue to accumulate in the genomic era, but in an uneven pattern. Major recent studies have inferred family trees based on the plastid genome (Saarela et al., 2018; Gallaher et al., 2022; Hu et al., 2023) or large parts of the nuclear genome (Huang et al., 2022). In addition, a wealth of full-genome assemblies is now available for grasses, mainly for groups that have been studied intensively, such as major crops and their congeners including rice (Wang & Han, 2022), maize (Hufford et al., 2021), wheat (Walkowiak et al., 2020) and sugarcane (Healey et al., 2024), among many others. At the same time, some genera and many species remain virtually unknown beyond a scientific name and general morphology. While the poorly known taxa may be represented in major herbaria, fresh material can be hard to obtain, weakening attempts to fully sample the grass tree of life with phylogenomic technologies.

Fortunately, we are now experiencing the confluence of: (1) global sources of diversity data including plant specimens held in herbaria world-wide, (2) widespread use of short-read sequencing that can accommodate even fragmented DNA, (3) analytical tools for assembling and interpreting massive amounts of sequence data, and (4) technical tools for efficient sequencing, such as target capture. For example, the development of a universal probe set for flowering plants, Angiosperms353 (Johnson et al., 2019; Baker et al., 2021), has enabled initiatives to sequence all angiosperm plant genera (Baker et al., 2022; Zuntini et al., 2024) or entire continental floras such as that of Australia (https://www.genomicsforaustralianplants.com/). It became apparent that an updated synthesis of existing and new data for grasses, similar to the previous Grass Phylogeny Working Group efforts (GPWG, 2001; GPWG II, 2012), would be timely and make possible a phylogeny that incorporates representatives of most of the 791 genera of the family using genome-scale data. In the process, we will gain a broader assessment of congruence among nuclear gene histories, including insights on the frequency and impact of incomplete lineage sorting (ILS) and reticulation.

Accordingly, here we present the most comprehensive nuclear phylogenomic tree of the grass family to date. Via a large community effort, we maximised taxon sampling by combining whole-genome, transcriptome, target capture and shotgun datasets. Based on the Angiosperms353 gene set, we inferred a nuclear multigene species tree using a coalescent-based method that accounts for incongruence due to ILS and uses information from multicopy gene trees. We also inferred a plastome tree and tested for incongruence between plastome and nuclear trees. Finally, we used gene tree–species tree reconciliation analyses to explore the signal for reticulation in the nuclear data.
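
As a rough illustration of what gene tree incongruence means in practice, the following minimal sketch counts how often each clade of a species tree also appears in a set of gene trees. It is not the coalescent-based method or the reconciliation analysis used in this study. The nested-tuple tree encoding, the function names `clades` and `clade_support`, and the four placeholder taxa `A` to `D` are all assumptions made for the example, and every tree is assumed to be rooted and to contain the same set of taxa.

```python
from typing import Dict, FrozenSet, List, Set, Tuple, Union

# Hypothetical encoding: a leaf is a taxon name, an internal node is a tuple of children.
Tree = Union[str, Tuple["Tree", ...]]


def clades(tree: Tree) -> Set[FrozenSet[str]]:
    """Return every non-trivial clade of a rooted tree as a set of leaf names."""
    found: Set[FrozenSet[str]] = set()

    def leaves(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        # An internal node's clade is the union of its children's leaf sets.
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    all_leaves = leaves(tree)
    found.discard(all_leaves)  # the root clade holds every taxon and is uninformative
    return found


def clade_support(species_tree: Tree, gene_trees: List[Tree]) -> Dict[FrozenSet[str], float]:
    """Fraction of gene trees that contain each clade of the species tree."""
    gene_clades = [clades(gene_tree) for gene_tree in gene_trees]
    return {
        clade: sum(clade in gc for gc in gene_clades) / len(gene_trees)
        for clade in clades(species_tree)
    }


# Toy data with placeholder taxa (not real grass lineages).
species_tree = ((("A", "B"), "C"), "D")
gene_trees = [
    ((("A", "B"), "C"), "D"),  # agrees with the species tree
    ((("A", "C"), "B"), "D"),  # conflicts on the A+B clade
    ((("A", "B"), "C"), "D"),  # agrees with the species tree
]

support = clade_support(species_tree, gene_trees)
for clade, fraction in sorted(support.items(), key=lambda item: len(item[0])):
    print(sorted(clade), round(fraction, 2))
```

A clade supported by only a fraction of gene trees signals conflict, but this count alone cannot say whether the conflict comes from ILS, reticulation, or gene tree estimation error. Separating those causes is what the coalescent and reconciliation analyses described above are for.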
