Comparative analysis of HiSeq3000 and BGISEQ-500 sequencing platform over whole genome sequencing metagenomics data

Animesh Kumar, E. Robertsen, N. Willassen, Juan Fu, Erik Hjerde
Genomics & Informatics, published 2023-12-31. DOI: 10.5808/gi.23072 (https://doi.org/10.5808/gi.23072)

Abstract

Recent advances in sequencing technologies have made it possible to generate metagenomic sequences on several different platforms. In this study, we analyzed and compared shotgun metagenomic sequences generated by the HiSeq3000 and BGISEQ-500 platforms from 12 sediment samples collected along the Norwegian coast. Metagenomic DNA sequences were normalized to an equal number of bases for both platforms and then evaluated using different taxonomic classifiers, reference databases, and assemblers. Normalized BGISEQ-500 sequences retained more reads and bases after preprocessing, while a slightly higher fraction of HiSeq3000 sequences was taxonomically classified. Kaiju classified a higher percentage of reads than Kraken2 for both platforms, and a comparison of reference databases for taxonomic classification showed that the MAR database outperformed RefSeq. Assembly with MEGAHIT produced longer assemblies and higher total contig counts than metaSPAdes in the majority of HiSeq3000 samples, but the assembly statistics notably improved with unprocessed or normalized reads. Our results indicate that both platforms perform comparably in terms of the percentage of taxonomically classified reads and assembled contig statistics for metagenomic samples. This study provides valuable insights for researchers selecting an appropriate sequencing platform and bioinformatics pipeline for their metagenomics studies.
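The abstract states that sequences were normalized to an equal number of bases for both platforms, but it does not say which tool or procedure was used. As a rough illustration only, the sketch below shows one way such a normalization could work: take the smallest total base count across platforms as the target, then randomly subsample each platform's reads without exceeding it. The function names (`subsample_to_bases`, `normalize_platforms`), the in-memory list-of-strings read representation, and the fixed random seed are all assumptions made for this example, not details from the study.

```python
import random
from typing import Dict, List


def subsample_to_bases(reads: List[str], target_bases: int, seed: int = 0) -> List[str]:
    """Randomly keep reads whose combined length does not exceed target_bases.

    Hypothetical helper for illustration; not the authors' method.
    """
    rng = random.Random(seed)
    shuffled = list(reads)
    rng.shuffle(shuffled)

    kept: List[str] = []
    total = 0
    for read in shuffled:
        # Skip any read that would push the total past the target.
        if total + len(read) > target_bases:
            continue
        kept.append(read)
        total += len(read)
    return kept


def normalize_platforms(
    platform_reads: Dict[str, List[str]], seed: int = 0
) -> Dict[str, List[str]]:
    """Subsample every platform down to the smallest total base count."""
    target = min(
        sum(len(read) for read in reads) for reads in platform_reads.values()
    )
    return {
        name: subsample_to_bases(reads, target, seed)
        for name, reads in platform_reads.items()
    }


if __name__ == "__main__":
    # Toy data: read lengths and counts are invented for the example.
    toy = {
        "HiSeq3000": ["A" * 150] * 10,   # 1,500 bases in total
        "BGISEQ-500": ["C" * 100] * 20,  # 2,000 bases in total
    }
    for name, reads in normalize_platforms(toy).items():
        print(name, len(reads), sum(len(r) for r in reads))
```

With reads of mixed lengths the subsampled totals may fall slightly below the target rather than matching it exactly, and real paired-end data would need both mates of a pair kept or dropped together.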