A Targeted Reference Database for Improved Analysis of Environmental 16S rRNA Oxford Nanopore Sequencing Data.

IF 5.5; Zone 1 (Biology); JCR Q1, Biochemistry & Molecular Biology
Melcy Philip, Tonje Nilsen, Sanna Majaneva, Ragnhild Pettersen, Morten Stokkan, Jessica Louise Ray, Nigel Keeley, Knut Rudi, Lars-Gustav Snipen
Molecular Ecology Resources, e70036. Journal Article. Published 2025-08-25. DOI: 10.1111/1755-0998.70036 (https://doi.org/10.1111/1755-0998.70036)

Abstract

The Oxford Nanopore Technologies (ONT) sequencing platform is compact and efficient, making it suitable for rapid biodiversity assessments in remote areas. Although ONT produces long reads, its error rate is higher than that of other platforms, so accurate taxonomic assignment requires high-quality reference databases. However, the absence of targeted databases for underexplored habitats, such as the seafloor, limits ONT's broader applicability for exploratory analysis. To address this, we propose an approach for building environmentally targeted databases to improve ONT-based 16S rRNA gene (16S) analysis, using seafloor sediment samples from the Norwegian coast as an example. We started by using Illumina short-read data to create a database of full-length or near full-length 16S sequences from seafloor samples. First, amplicons were mapped to the SILVA database, and matches were added to our database. Unmatched amplicons were then reconstructed using the METASEED and Barrnap methodologies with amplicon and metagenome data. Finally, where neither strategy succeeded, the short-read sequences themselves were included in the database. This resulted in AQUAeD-DB, which contains 14,545 16S sequences clustered at 95% identity. Comparative database analysis reveals that AQUAeD-DB provides consistent results for Illumina and Nanopore read assignments (median correlation coefficient: 0.50), whereas a standard database showed a substantially weaker correlation. These findings also emphasise the database's potential to recognise both high- and low-abundance taxa, which could be key indicators in environmental studies. This work highlights the necessity of targeted databases for environmental analysis, especially for ONT-based studies, and lays the foundations for future extension of the database.
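
The three-tier fallback described in the abstract (SILVA match, then reconstruction, then the short read itself) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `build_targeted_db`, `silva_lookup` and `reconstruct` are hypothetical, the lookups are plain callables standing in for real mapping and METASEED/Barrnap steps, and the final clustering at 95% identity is omitted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class DbEntry:
    amplicon_id: str
    sequence: str
    source: str  # "silva", "reconstructed", or "short_read"


# Hypothetical signature: takes an amplicon sequence and returns a
# full-length (or near full-length) 16S sequence, or None on failure.
Lookup = Callable[[str], Optional[str]]


def build_targeted_db(
    amplicons: Dict[str, str],
    silva_lookup: Lookup,
    reconstruct: Lookup,
) -> List[DbEntry]:
    """Assign each amplicon a database entry using a tiered fallback."""
    entries: List[DbEntry] = []
    for amp_id, amp_seq in amplicons.items():
        # Tier 1: use the reference sequence if the amplicon maps to it.
        full = silva_lookup(amp_seq)
        if full is not None:
            entries.append(DbEntry(amp_id, full, "silva"))
            continue
        # Tier 2: try to reconstruct a longer sequence from other data.
        rebuilt = reconstruct(amp_seq)
        if rebuilt is not None:
            entries.append(DbEntry(amp_id, rebuilt, "reconstructed"))
            continue
        # Tier 3: fall back to the short-read sequence itself.
        entries.append(DbEntry(amp_id, amp_seq, "short_read"))
    return entries


# Toy data (invented for illustration only): dictionaries stand in for
# the mapping and reconstruction steps.
toy_amplicons = {"a1": "ACGT", "a2": "GGCC", "a3": "TTAA"}
toy_silva = {"ACGT": "TTACGTGG"}
toy_reconstructed = {"GGCC": "AAGGCCTT"}

db = build_targeted_db(toy_amplicons, toy_silva.get, toy_reconstructed.get)
for entry in db:
    print(entry.amplicon_id, entry.source, entry.sequence)
```

Recording the `source` of each entry keeps track of which sequences are full-length references and which are short-read placeholders, which would matter if the database were later extended.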

Source journal: Molecular Ecology Resources (Biology: Evolutionary Biology)
CiteScore: 15.60
Self-citation rate: 5.20%
Articles published: 170
Review time: 3 months
About the journal: Molecular Ecology Resources promotes the creation of comprehensive resources for the scientific community, encompassing computer programs, statistical and molecular advancements, and a diverse array of molecular tools. Serving as a conduit for disseminating these resources, the journal targets a broad audience of researchers in the fields of evolution, ecology, and conservation. Articles in Molecular Ecology Resources are crafted to support investigations tackling significant questions within these disciplines. In addition to original resource articles, Molecular Ecology Resources features Reviews, Opinions, and Comments relevant to the field. The journal also periodically releases Special Issues focusing on resource development within specific areas.