Single-fly genome assemblies fill major phylogenomic gaps across the Drosophilidae Tree of Life.

IF 9.8; Zone 1, Biology; Q1, Agricultural and Biological Sciences
PLoS Biology. Pub Date: 2024-07-18. eCollection Date: 2024-07-01. DOI: 10.1371/journal.pbio.3002697
Bernard Y Kim, Hannah R Gellert, Samuel H Church, Anton Suvorov, Sean S Anderson, Olga Barmina, Sofia G Beskid, Aaron A Comeault, K Nicole Crown, Sarah E Diamond, Steve Dorus, Takako Fujichika, James A Hemker, Jan Hrcek, Maaria Kankare, Toru Katoh, Karl N Magnacca, Ryan A Martin, Teruyuki Matsunaga, Matthew J Medeiros, Danny E Miller, Scott Pitnick, Michele Schiffer, Sara Simoni, Tessa E Steenwinkel, Zeeshan A Syed, Aya Takahashi, Kevin H-C Wei, Tsuya Yokoyama, Michael B Eisen, Artyom Kopp, Daniel Matute, Darren J Obbard, Patrick M O'Grady, Donald K Price, Masanori J Toda, Thomas Werner, Dmitri A Petrov

Abstract

Long-read sequencing is driving rapid progress in genome assembly across all major groups of life, including species of the family Drosophilidae, a longtime model system for genetics, genomics, and evolution. We previously developed a cost-effective hybrid Oxford Nanopore (ONT) long-read and Illumina short-read sequencing approach and used it to assemble 101 drosophilid genomes from laboratory cultures, greatly increasing the number of genome assemblies for this taxonomic group. The next major challenge is to address the laboratory culture bias in taxon sampling by sequencing genomes of species that cannot easily be reared in the lab. Here, we build upon our previous methods to perform amplification-free ONT sequencing of single wild flies obtained either directly from the field or from ethanol-preserved specimens in museum collections, greatly improving the representation of lesser-studied drosophilid taxa in whole-genome data. Using Illumina NovaSeq X Plus and ONT P2 sequencers with R10.4.1 chemistry, we set a new benchmark for inexpensive hybrid genome assembly at US $150 per genome while assembling genomes from as little as 35 ng of genomic DNA from a single fly. We present 183 new genome assemblies for 179 species as a resource for drosophilid systematics, phylogenetics, and comparative genomics. Of these genomes, 62 are from pooled lab strains and 121 from single adult flies. Despite the sample limitations of working with small insects, most single-fly diploid assemblies are comparable in contiguity (>1 Mb contig N50), completeness (>98% complete dipteran BUSCOs), and accuracy (>QV40 genome-wide with ONT R10.4.1) to assemblies from inbred lines. We present a well-resolved multi-locus phylogeny for 360 drosophilid and 4 outgroup species encompassing all publicly available (as of August 2023) genomes for this group. Finally, we present a Progressive Cactus whole-genome, reference-free alignment built from a subset of 298 suitably high-quality drosophilid genomes. The new assemblies and alignment, along with updated laboratory protocols and computational pipelines, are released as an open resource and as a tool for studying evolution at the scale of an entire insect family.
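
The abstract quotes two assembly metrics, contig N50 and QV. The sketch below shows how these are conventionally computed. It is an illustration only, not the authors' pipeline: the function names are hypothetical, and the contig lengths in the usage note are made-up example values. It assumes the usual definitions: N50 is the contig length at which the cumulative sum of contigs, sorted from longest to shortest, first reaches half of the total assembly length, and QV is the Phred-scaled per-base error rate.

```python
import math


def contig_n50(lengths):
    """Contig N50: walk contigs from longest to shortest and return the
    length at which the running total first reaches half the assembly size."""
    if not lengths:
        raise ValueError("need at least one contig length")
    half_total = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


def qv_to_error_rate(qv):
    """Convert a Phred-scaled quality value to a per-base error probability."""
    return 10 ** (-qv / 10)


def error_rate_to_qv(error_rate):
    """Convert a per-base error probability to a Phred-scaled quality value."""
    return -10 * math.log10(error_rate)
```

For example, a hypothetical assembly with contigs of 2.0, 1.5, 1.0, and 0.5 Mb has a contig N50 of 1.5 Mb, and QV40 corresponds to roughly one error per 10,000 bases.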

Source journal
PLoS Biology
Subject categories: Biochemistry & Molecular Biology; Biology
CiteScore: 15.40
Self-citation rate: 2.00%
Articles published: 359
Review time: 3-8 weeks
Journal description: PLOS Biology is the flagship journal of the Public Library of Science (PLOS) and focuses on publishing groundbreaking and relevant research in all areas of biological science. The journal features works at various scales, ranging from molecules to ecosystems, and also encourages interdisciplinary studies. PLOS Biology publishes articles that demonstrate exceptional significance, originality, and relevance, with a high standard of scientific rigor in methodology, reporting, and conclusions. The journal aims to advance science and serve the research community by transforming research communication to align with the research process. It offers evolving article types and policies that empower authors to share the complete story behind their scientific findings with a diverse global audience of researchers, educators, policymakers, patient advocacy groups, and the general public. PLOS Biology, along with other PLOS journals, is widely indexed by major services such as Crossref, Dimensions, DOAJ, Google Scholar, PubMed, PubMed Central, Scopus, and Web of Science. Additionally, PLOS Biology is indexed by various other services including AGRICOLA, Biological Abstracts, BIOSIS Previews, CABI CAB Abstracts, CABI Global Health, CAPES, CAS, CNKI, Embase, Journal Guide, MEDLINE, and Zoological Record, ensuring that the research content is easily accessible and discoverable by a wide range of audiences.