De novo clustering of long-read amplicons improves phylogenetic insight into microbiome data.

IF 12.2 · Zone 1 (Medicine) · JCR Q1 · Gastroenterology & Hepatology
Gut Microbes. Pub Date: 2025-12-01; Epub Date: 2025-06-11; DOI: 10.1080/19490976.2025.2516703
Yan Hui, Dennis Sandris Nielsen, Lukasz Krych
Gut Microbes 17(1): 2516703. Full-text PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12160608/pdf/

Abstract

Long-read amplicon profiling through read classification limits phylogenetic analysis of amplicons, while community analysis of multicopy genes, which relies on unique molecular identifier (UMI) corrections, often demands deep sequencing. To address this, we present a long amplicon consensus analysis (LACA) workflow employing multiple de novo clustering approaches based on sequence dissimilarity. LACA keeps the average error rate of corrected sequences below 1% for Oxford Nanopore Technologies (ONT) R9.4.1 and ONT R10.3 data, below 0.2% for ONT R10.4.1 data, and below 0.1% for high-accuracy ONT Duplex and Pacific Biosciences (PacBio) circular consensus sequencing (CCS) data, in both simulated 16S rRNA and real 16-23S rRNA amplicon datasets. In high-accuracy PacBio CCS data, the clustering-based correction matched UMI correction, and it outperformed 4× UMI correction in noisy ONT R10.3 and R9.4.1 data. Notably, LACA preserved phylogenetic fidelity in long operational taxonomic units and enhanced microbiome-wide phenotype characterization for synthetic mock communities and human vaginal samples.
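To make "de novo clustering based on sequence dissimilarity" concrete, here is a minimal Python sketch. It is not the LACA implementation, and the abstract does not say which clustering algorithms LACA uses. The sketch assumes a simple greedy, seed-based scheme with normalized edit distance as the dissimilarity measure. The function names (`edit_distance`, `dissimilarity`, `greedy_cluster`), the toy reads and the 0.2 threshold are all hypothetical choices made for illustration.

```python
from typing import List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion from a
                            curr[j - 1] + 1,      # insertion into a
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def dissimilarity(a: str, b: str) -> float:
    """Edit distance divided by the longer sequence length (assumed metric)."""
    return edit_distance(a, b) / max(len(a), len(b), 1)


def greedy_cluster(reads: List[str], threshold: float) -> List[Tuple[str, List[str]]]:
    """Assign each read to the first seed within `threshold`, else open a new cluster."""
    clusters: List[Tuple[str, List[str]]] = []
    for read in reads:
        for seed, members in clusters:
            if dissimilarity(read, seed) <= threshold:
                members.append(read)
                break
        else:
            # No existing seed is close enough, so this read seeds a new cluster.
            clusters.append((read, [read]))
    return clusters


if __name__ == "__main__":
    # Toy reads (hypothetical): two underlying templates with single-base errors.
    toy_reads = ["ACGTACGTAC", "ACGTACGTAT", "TTTTGGGGCC", "ACGAACGTAC", "TTTTGGGGCA"]
    for seed, members in greedy_cluster(toy_reads, threshold=0.2):
        print(seed, len(members))
```

The same `dissimilarity` function can serve as a rough per-sequence error rate when a corrected sequence is compared with its known reference, which is the kind of quantity the abstract's 1%, 0.2% and 0.1% thresholds describe. A real workflow would use alignment-aware tools and build a consensus per cluster rather than reusing the seed read.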

Source journal
Gut Microbes (Medicine: Microbiology (medical))
CiteScore: 18.20
Self-citation rate: 3.30%
Articles published: 196
Review time: 10 weeks
About the journal: The intestinal microbiota plays a crucial role in human physiology, influencing various aspects of health and disease such as nutrition, obesity, brain function, allergic responses, immunity, inflammatory bowel disease, irritable bowel syndrome, cancer development, cardiac disease, liver disease, and more. Gut Microbes serves as a platform for showcasing and discussing state-of-the-art research related to the microorganisms present in the intestine. The journal emphasizes mechanistic and cause-and-effect studies. Additionally, it has a counterpart, Gut Microbes Reports, which places a greater focus on emerging topics and comparative and incremental studies.