Merging and concatenation of sequencing reads: a bioinformatics workflow for the comprehensive profiling of microbiome from amplicon data.

Impact factor 2.2; Zone 4 (Biology); JCR Q3 (Microbiology)
Meganathan P Ramakodi
Journal: FEMS Microbiology Letters. Publication type: Journal Article. Published: 2024-01-09. DOI: 10.1093/femsle/fnae009 (https://doi.org/10.1093/femsle/fnae009)

Abstract

Comprehensive profiling of microbial diversity is essential for understanding ecosystem functions. Universal primer sets such as 515Y/926R amplify part of both the 16S and 18S rRNA and can therefore be used to infer the diversity of prokaryotes and eukaryotes. However, analysing the resulting mixed sequencing data poses a bioinformatics challenge: because the amplicon lengths differ, the 16S and 18S rRNA sequences must first be separated and then analysed independently. This study describes an alternative strategy, a merging and concatenation workflow, that analyses the mixed amplicon data without separating the 16S and 18S rRNA sequences. The workflow was tested with 24 mock community (MC) samples, and the analyses resolved the composition of prokaryotes and eukaryotes adequately. In addition, the observed and expected abundances in the MC samples were strongly correlated (cor = 0.950; P-value = 4.754e-10), which suggests that the computational approach can infer microbial proportions accurately. Further, 18 samples collected from the Sundarbans mangrove region were analysed as a case study. The analyses identified Proteobacteria, Bacteroidota, Actinobacteriota, Cyanobacteria, and Crenarchaeota as the dominant prokaryotic phyla, and Metazoa, Gyrista, Cryptophyta, Chlorophyta, and Dinoflagellata as the dominant eukaryotic divisions. Thus, the results support the applicability of the method in environmental microbiome research. The merging and concatenation workflow presented here requires considerably fewer computational resources and relies on widely used bioinformatics packages. For an equivalent number of samples, it saves analysis time compared to the conventional approach while inferring the diversity of the major microbial domains from mixed amplicon data at comparable accuracy.
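
The abstract does not spell out the individual steps of the workflow, so the following is only a minimal sketch of one plausible reading of "merging and concatenation": read pairs whose forward and reverse reads overlap are merged into a single sequence, and pairs that do not overlap (as would happen when an amplicon is longer than the two reads combined) are joined end to end with a spacer instead of being discarded. The function names, the exact-match overlap test, the `min_overlap` threshold, and the `N` spacer are all assumptions made for illustration; they are not taken from the paper, and real read-merging tools tolerate mismatches and use quality scores.

```python
# Illustrative sketch only. Function names, the exact-match overlap rule,
# min_overlap, and the "N" spacer are assumptions, not the paper's method.

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def merge_or_concatenate(fwd: str, rev: str,
                         min_overlap: int = 12,
                         pad: str = "N" * 10) -> tuple[str, str]:
    """Merge a read pair on an exact overlap, otherwise concatenate it.

    fwd is the forward read; rev is the reverse read as sequenced.
    Returns (sequence, label), where label is "merged" or "concatenated".
    """
    # Bring the reverse read into the forward orientation.
    rev_rc = reverse_complement(rev)

    # Try the longest candidate overlap first, down to min_overlap.
    for k in range(min(len(fwd), len(rev_rc)), min_overlap - 1, -1):
        if fwd[-k:] == rev_rc[:k]:
            # Keep the overlapping bases once.
            return fwd + rev_rc[k:], "merged"

    # No acceptable overlap: keep both reads, separated by the spacer.
    return fwd + pad + rev_rc, "concatenated"
```

Under these assumptions, every read pair yields one sequence regardless of amplicon length, which is what would let shorter and longer amplicons pass through the same downstream steps without being separated first.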

Source journal: FEMS Microbiology Letters (Biology, Microbiology)
CiteScore: 4.30
Self-citation rate: 0.00%
Articles published: 112
Review time: 1.9 months
About the journal: FEMS Microbiology Letters gives priority to concise papers that merit rapid publication by virtue of their originality, general interest and contribution to new developments in microbiology. All aspects of microbiology, including virology, are covered. 2019 Impact Factor: 1.987, Journal Citation Reports (Source Clarivate, 2020). Ranking: 98/135 (Microbiology).
The journal is divided into eight Sections: Physiology and Biochemistry (including genetics, molecular biology and ‘omic’ studies); Food Microbiology (from food production and biotechnology to spoilage and foodborne pathogens); Biotechnology and Synthetic Biology; Pathogens and Pathogenicity (including medical, veterinary, plant and insect pathogens, particularly those relating to food security, with the exception of viruses); Environmental Microbiology (including ecophysiology, ecogenomics and meta-omic studies); Virology (viruses infecting any organism, including Bacteria and Archaea); Taxonomy and Systematics (for publication of novel taxa, taxonomic reclassifications and reviews of a taxonomic nature); Professional Development (including education, training, CPD, research assessment frameworks, research and publication metrics, best practice, careers and history of microbiology).
If you are unsure which Section is most appropriate for your manuscript, for example in the case of transdisciplinary studies, we recommend that you contact the Editor-in-Chief by email prior to submission.
Our scope includes any type of microorganism: all members of the Bacteria and the Archaea and microbial members of the Eukarya (yeasts, filamentous fungi, microbial algae, protozoa, oomycetes, myxomycetes, etc.) as well as all viruses.