GCI: a continuity inspector for complete genome assembly.

Quanyu Chen, Chentao Yang, Guojie Zhang, Dongya Wu
Journal: Bioinformatics (Oxford, England)
Published: 2024-11-01
DOI: 10.1093/bioinformatics/btae633 (https://doi.org/10.1093/bioinformatics/btae633)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11550331/pdf/
Citations: 0

Abstract

Motivation: Recent advances in long-read sequencing technologies have greatly facilitated the production of high-quality genome assemblies. Telomere-to-telomere (T2T) gapless assembly has become the new gold standard for genome assembly efforts, and several recent studies have claimed to produce T2T-level reference genomes. However, there is still no universal standard for deciding whether a genome assembly qualifies as T2T. Traditional assembly assessment metrics (N50 and its derivatives) cannot distinguish a nearly T2T assembly from a truly T2T assembly in terms of continuity, either globally or locally. Moreover, these metrics are computed without reference to the raw reads, so they are easily inflated by artificial operations. A tool that evaluates gaplessness at single-nucleotide resolution and reflects true completeness is therefore urgently needed in the era of complete genomes.
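
As a minimal illustration of why N50 can be inflated without any read support, the sketch below computes N50 for a hypothetical set of contig lengths and then for the same set after two contigs are joined artificially. The function name `n50` and the contig lengths are assumptions made for this example; they are not taken from the GCI code.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length of the shortest contig in the smallest set of
    longest contigs that together cover at least half of the assembly.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no contigs given")


# Hypothetical contig lengths (in Mb) for one assembly.
original = [40, 30, 20, 10]

# The same sequence after the two longest contigs are joined by hand,
# with no evidence from the raw reads that the join is correct.
joined = [70, 20, 10]

print(n50(original))  # 30
print(n50(joined))    # 70
```

The total assembly size is unchanged, yet N50 rises from 30 to 70, because the metric only looks at contig lengths and never at the reads.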

Results: Here, we present Genome Continuity Inspector (GCI), a tool that assesses genome assembly continuity at single-base resolution and evaluates how close an assembly is to the T2T level. GCI uses multiple aligners to map long reads from various sequencing platforms back to the assembly. From the curated mapping coverage of high-confidence read alignments, GCI identifies potential assembly issues. It also reports GCI scores that quantify overall assembly continuity at the whole-genome or chromosome scale.
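
The general idea of flagging candidate issues from per-base read coverage can be sketched as follows. This is an illustrative simplification, not GCI's actual algorithm: the function name `low_coverage_regions`, the `min_depth` threshold, and the toy depth values are all assumptions, and the filtering of alignments and the GCI score itself are not reproduced here.

```python
def low_coverage_regions(depth, min_depth=1):
    """Return half-open (start, end) intervals where depth < min_depth.

    `depth` is a sequence of per-base coverage values for one contig,
    assumed to be computed from already-filtered read alignments.
    """
    regions = []
    start = None
    for pos, d in enumerate(depth):
        if d < min_depth:
            if start is None:
                start = pos  # a low-coverage run begins here
        elif start is not None:
            regions.append((start, pos))  # the run ended at pos
            start = None
    if start is not None:
        regions.append((start, len(depth)))  # run reaches the contig end
    return regions


# Toy per-base depths for a 7 bp contig (hypothetical values).
depth = [5, 5, 0, 0, 4, 3, 0]
print(low_coverage_regions(depth))  # [(2, 4), (6, 7)]
```

Each reported interval marks bases that no retained alignment supports, which is the kind of position a contig-length metric such as N50 cannot reveal.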

Availability and implementation: The open-source GCI code is freely available on GitHub (https://github.com/yeeus/GCI) under the MIT license.

Supplementary information: Supplementary data are available at Bioinformatics online.
