Integration of hybrid and self-correction method improves the quality of long-read sequencing data.

IF 2.5 | Zone 3, Biology | JCR Q3, BIOTECHNOLOGY & APPLIED MICROBIOLOGY

Briefings in Functional Genomics Pub Date : 2024-05-15 DOI:10.1093/bfgp/elad026

Tao Tang, Yiping Liu, Binshuang Zheng, Rong Li, Xiaocai Zhang, Yuansheng Liu

Pages: 249-255

Citations: 0

Abstract


Integration of hybrid and self-correction method improves the quality of long-read sequencing data.

Third-generation sequencing (TGS) technologies have revolutionized genome science over the past decade. However, the long-read data produced by TGS platforms have a much higher error rate than data from earlier technologies, which complicates downstream analysis. Several error correction tools for long-read data have been developed; they fall into two categories, hybrid tools and self-correction tools. So far, the two types have been investigated separately, and their interplay remains understudied. Here, we integrate hybrid and self-correction methods for high-quality error correction. Our procedure leverages the inter-similarity among long-read data and the high-accuracy information from short reads. We compare the performance of our method with that of state-of-the-art error correction tools on Escherichia coli and Arabidopsis thaliana datasets. The results show that the integration approach outperformed the existing error correction methods and holds promise for improving the quality of downstream analyses in genomic research.
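To make the two tool categories concrete, here is a toy Python sketch of how a self-correction step and a hybrid step might be chained. It is not the authors' procedure, which the abstract does not detail. It assumes the long reads are already aligned, have equal length, and contain substitution errors only; all function names are hypothetical.

```python
from collections import Counter


def self_correct(aligned_long_reads):
    """Self-correction (toy): column-wise majority vote over aligned long reads."""
    consensus = []
    for column in zip(*aligned_long_reads):
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


def trusted_kmers(short_reads, k):
    """Collect every k-mer seen in the (assumed accurate) short reads."""
    return {r[i:i + k] for r in short_reads for i in range(len(r) - k + 1)}


def covering_kmers_trusted(bases, i, trusted, k):
    """True if every k-mer overlapping position i is in the trusted set."""
    n = len(bases)
    starts = range(max(0, i - k + 1), min(i, n - k) + 1)
    return all("".join(bases[s:s + k]) in trusted for s in starts)


def hybrid_correct(read, trusted, k):
    """Hybrid correction (toy): try single-base substitutions until the
    k-mers covering a position all appear in the short-read k-mer set."""
    bases = list(read)
    for i in range(len(bases)):
        if covering_kmers_trusted(bases, i, trusted, k):
            continue
        for alt in "ACGT":
            candidate = bases[:i] + [alt] + bases[i + 1:]
            if covering_kmers_trusted(candidate, i, trusted, k):
                bases = candidate
                break
    return "".join(bases)


def integrate(aligned_long_reads, short_reads, k):
    """Run the self-correction vote first, then polish with short-read k-mers."""
    consensus = self_correct(aligned_long_reads)
    return hybrid_correct(consensus, trusted_kmers(short_reads, k), k)


# Made-up example data: two long reads share the same error at position 10,
# so the majority vote alone cannot repair it.
genome = "ACGTTGCAAGCT"
long_reads = ["ACGTTGCAAGAT", "ACGTTGCAAGAT", "ACCTTGCAAGCT"]
short_reads = ["ACGTTGC", "TTGCAAG", "CAAGCT"]

voted = self_correct(long_reads)
polished = integrate(long_reads, short_reads, k=4)
```

In this made-up example the vote removes the error that occurs in only one read but keeps the one shared by two reads; the short-read k-mer check then repairs the remaining position. Real tools must also handle insertions and deletions, which this sketch ignores.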


Source journal

Briefings in Functional Genomics (BIOTECHNOLOGY & APPLIED MICROBIOLOGY; GENETICS & HEREDITY)

CiteScore: 6.30

Self-citation rate: 2.50%

Articles published: (not listed)

Review time: 6-12 weeks

About the journal: Briefings in Functional Genomics publishes high-quality, peer-reviewed articles that focus on the use, development or exploitation of genomic approaches, and their application to all areas of biological research. As well as exploring thematic areas where these techniques and protocols are being used, articles review the impact that these approaches have had, or are likely to have, on their field. Subjects covered by the journal include, but are not restricted to: the identification and functional characterisation of coding and non-coding features in genomes, microarray technologies, gene expression profiling, next-generation sequencing, pharmacogenomics, phenomics, SNP technologies, transgenic systems, mutation screens and genotyping. Articles range in scope and depth from the introductory level to specific details of protocols and analyses, encompassing bacterial, fungal, plant, animal and human data. The editorial board welcomes the submission of review articles for publication. Essential criteria for the publication of papers are that they do not contain primary data and that they are high-quality, clearly written review articles that provide a balanced, highly informative and up-to-date perspective for researchers in the field of functional genomics.