Authors: Yan Hui, Dennis Sandris Nielsen, Lukasz Krych
Journal: Gut Microbes, volume 17, issue 1, article 2516703
DOI: 10.1080/19490976.2025.2516703 (https://doi.org/10.1080/19490976.2025.2516703)
Publication type: Journal Article
Published online: 2025-06-11 (Epub); publication date: 2025-12-01
Journal impact factor: 12.2; JCR: Q1, Gastroenterology & Hepatology
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12160608/pdf/
De novo clustering of long-read amplicons improves phylogenetic insight into microbiome data.
Long-read amplicon profiling through read classification limits phylogenetic analysis of amplicons while community analysis of multicopy genes, relying on unique molecular identifier (UMI) corrections, often demands deep sequencing. To address this, we present a long amplicon consensus analysis (LACA) workflow employing multiple de novo clustering approaches based on sequence dissimilarity. LACA controls the average error rate of corrected sequences below 1% for the Oxford Nanopore Technologies (ONT) R9.4.1 and ONT R10.3 data, 0.2% for ONT R10.4.1, and 0.1% for high-accuracy ONT Duplex and Pacific Biosciences (PacBio) circular consensus sequencing (CCS) data in both simulated 16S rRNA and real 16-23S rRNA amplicon datasets. In high-accuracy PacBio CCS data, the clustering-based correction matched UMI correction, while outperforming 4× UMI correction in noisy ONT R10.3 and R9.4.1 data. Notably, LACA preserved phylogenetic fidelity in long operational taxonomic units and enhanced microbiome-wide phenotype characterization for synthetic mock communities and human vaginal samples.
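The abstract refers to de novo clustering based on sequence dissimilarity and to the error rate of corrected sequences, but does not spell out the algorithms. The sketch below is purely illustrative and is not LACA's implementation: it assumes a simple greedy, seed-based clustering with a normalised edit distance as the dissimilarity measure, and the names `edit_distance`, `dissimilarity`, `greedy_cluster`, and `error_rate`, the toy reads, and the 0.2 threshold are all hypothetical.

```python
from typing import List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # deletion
                curr[j - 1] + 1,           # insertion
                prev[j - 1] + (ca != cb),  # match or substitution
            ))
        prev = curr
    return prev[-1]


def dissimilarity(a: str, b: str) -> float:
    """Edit distance normalised by the longer sequence length (0.0 to 1.0)."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


def greedy_cluster(reads: List[str], threshold: float) -> List[List[str]]:
    """Put each read in the first cluster whose seed (first member) is within
    `threshold`; otherwise the read seeds a new cluster. No reference
    database is consulted, which is what makes the clustering de novo."""
    clusters: List[List[str]] = []
    for read in reads:
        for members in clusters:
            if dissimilarity(read, members[0]) <= threshold:
                members.append(read)
                break
        else:
            clusters.append([read])
    return clusters


def error_rate(corrected: str, reference: str) -> float:
    """Edit errors per reference base (assumes a non-empty reference)."""
    return edit_distance(corrected, reference) / len(reference)


# Toy reads (hypothetical): two underlying sequences with single-base errors.
reads = ["ACGTACGTAC", "ACGTACGTAA", "TTGGCCAATT", "TTGGCCAATG", "ACGAACGTAC"]
clusters = greedy_cluster(reads, threshold=0.2)
print([len(c) for c in clusters])  # prints [3, 2]
```

In practice, dedicated tools replace the quadratic edit-distance loop with alignment or k-mer heuristics, and a consensus is built per cluster before the error rate is measured against a known reference; the sketch only shows the shape of the idea.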
About the journal:
The intestinal microbiota plays a crucial role in human physiology, influencing various aspects of health and disease such as nutrition, obesity, brain function, allergic responses, immunity, inflammatory bowel disease, irritable bowel syndrome, cancer development, cardiac disease, liver disease, and more.
Gut Microbes serves as a platform for showcasing and discussing state-of-the-art research related to the microorganisms present in the intestine. The journal emphasizes mechanistic and cause-and-effect studies. Additionally, it has a counterpart, Gut Microbes Reports, which places a greater focus on emerging topics and comparative and incremental studies.