GAGM: Genome assembly on GPU using mate pairs

Ashutosh Jain, Anshuj Garg, K. Paul
Published in: 20th Annual International Conference on High Performance Computing, 2013-12-01
DOI: 10.1109/HiPC.2013.6799107 (https://doi.org/10.1109/HiPC.2013.6799107)
Citations: 6

Abstract

Genome fragment assembly has long been a time- and computation-intensive problem in bioinformatics. Many parallel assemblers have been proposed to accelerate the process, but no effective approach has been proposed for GPUs. Meanwhile, as GPUs grow more powerful, applications from various research fields are being parallelized to take advantage of the massive number of "cores" they provide. In this paper we present the design and development of a GPU-based assembler (GAGM) for sequence assembly using Nvidia's GPUs with the CUDA programming model. Our assembler uses the mate-pair reads produced by current NGS technologies to build a paired de Bruijn graph. Every paired read is broken into paired k-mers and l-mers. Every paired k-mer represents a vertex, and paired l-mers are mapped as edges. Contigs are formed by grouping the regions of the graph that can be unambiguously connected. We present parallel algorithms for k-mer extraction, paired de Bruijn graph construction, and grouping of edges. We have benchmarked GAGM on four bacterial genomes. Our results show that the GPU design is effective in terms of both running time and the quality of the assembly produced.
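The three steps named in the abstract (k-mer extraction, paired de Bruijn graph construction, and grouping of edges) can be illustrated with a small sequential sketch. This is not the GAGM implementation, which runs in parallel on the GPU with CUDA; it is a CPU-only illustration under several assumptions that the abstract does not state: l is taken to be k + 1, the two mates are paired at matching offsets in the same orientation, and all function names (`paired_mers`, `build_paired_graph`, `unambiguous_paths`, `spell`) are hypothetical.

```python
from collections import defaultdict


def paired_mers(read_a, read_b, size):
    """Return paired substrings of length `size` at matching offsets of two mates."""
    n = min(len(read_a), len(read_b)) - size + 1
    return [(read_a[i:i + size], read_b[i:i + size]) for i in range(max(n, 0))]


def build_paired_graph(read_pairs, k):
    """Map each paired l-mer (assumed l = k + 1) to an edge between paired k-mers."""
    edges = set()
    for read_a, read_b in read_pairs:
        for a, b in paired_mers(read_a, read_b, k + 1):
            src = (a[:-1], b[:-1])  # paired k-mer prefix (source vertex)
            dst = (a[1:], b[1:])    # paired k-mer suffix (target vertex)
            edges.add((src, dst))
    return edges


def unambiguous_paths(edges):
    """Group edges into maximal paths whose interior vertices do not branch.

    A vertex is interior when it has exactly one incoming and one outgoing
    edge. Isolated cycles (all vertices interior) are skipped in this sketch.
    """
    out_edges = defaultdict(list)
    in_deg = defaultdict(int)
    for src, dst in sorted(edges):
        out_edges[src].append(dst)
        in_deg[dst] += 1

    def is_interior(v):
        return in_deg[v] == 1 and len(out_edges[v]) == 1

    paths = []
    for v in list(out_edges):  # snapshot: lookups below may add empty entries
        if is_interior(v):
            continue
        for nxt in out_edges[v]:
            path = [v, nxt]
            while is_interior(path[-1]):
                path.append(out_edges[path[-1]][0])
            paths.append(path)
    return paths


def spell(path):
    """Spell the pair of sequences represented by a path of paired k-mers."""
    first_a, first_b = path[0]
    seq_a = first_a + "".join(a[-1] for a, _ in path[1:])
    seq_b = first_b + "".join(b[-1] for _, b in path[1:])
    return seq_a, seq_b


if __name__ == "__main__":
    graph = build_paired_graph([("ACGTAC", "TTGACA")], k=3)
    for p in unambiguous_paths(graph):
        print(spell(p))
```

In this toy input a single mate pair yields three edges that form one non-branching path, so the path spells back the two original reads. Real mate-pair data would also need handling for read orientation, insert distance, and sequencing errors, none of which is modeled here.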