{"title":"ggm:利用配偶对在GPU上组装基因组","authors":"Ashutosh Jain, Anshuj Garg, K. Paul","doi":"10.1109/HiPC.2013.6799107","DOIUrl":null,"url":null,"abstract":"Genome fragment assembly has long been a time and computation intensive problem in the field of bioinformatics. Many parallel assemblers have been proposed to accelerate the process but there hasn't been any effective approach proposed for GPUs. Also with the increasing power of GPUs, applications from various research fields are being parallelized to take advantage of the massive number of “cores” available in GPUs. In this paper we present the design and development of a GPU based assembler (GAGM) for sequence assembly using Nvidia's GPUs with the CUDA programming model. Our assembler utilizes the mate pair reads produced by the current NGS technologies to build paired de Bruijn graph. Every paired read is broken into paired k-mers and l-mers. Every paired k-mer represents a vertex and paired l-mers are mapped as edges. Contigs are formed by grouping the regions of graph which can be unambiguously connected. We present parallel algorithms for k - mer extraction, paired de Bruijn graph construction and grouping of edges. We have benchmarked GAGM on four bacterial genomes. Our results show that the design on GPU is effective in terms of time as well as the quality of assembly produced.","PeriodicalId":206307,"journal":{"name":"20th Annual International Conference on High Performance Computing","volume":"137 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":"{\"title\":\"GAGM: Genome assembly on GPU using mate pairs\",\"authors\":\"Ashutosh Jain, Anshuj Garg, K. Paul\",\"doi\":\"10.1109/HiPC.2013.6799107\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Genome fragment assembly has long been a time and computation intensive problem in the field of bioinformatics. Many parallel assemblers have been proposed to accelerate the process but there hasn't been any effective approach proposed for GPUs. Also with the increasing power of GPUs, applications from various research fields are being parallelized to take advantage of the massive number of “cores” available in GPUs. In this paper we present the design and development of a GPU based assembler (GAGM) for sequence assembly using Nvidia's GPUs with the CUDA programming model. Our assembler utilizes the mate pair reads produced by the current NGS technologies to build paired de Bruijn graph. Every paired read is broken into paired k-mers and l-mers. Every paired k-mer represents a vertex and paired l-mers are mapped as edges. Contigs are formed by grouping the regions of graph which can be unambiguously connected. We present parallel algorithms for k - mer extraction, paired de Bruijn graph construction and grouping of edges. We have benchmarked GAGM on four bacterial genomes. Our results show that the design on GPU is effective in terms of time as well as the quality of assembly produced.\",\"PeriodicalId\":206307,\"journal\":{\"name\":\"20th Annual International Conference on High Performance Computing\",\"volume\":\"137 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2013-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"6\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"20th Annual International Conference on High Performance Computing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/HiPC.2013.6799107\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"20th Annual International Conference on High Performance Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HiPC.2013.6799107","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Genome fragment assembly has long been a time and computation intensive problem in the field of bioinformatics. Many parallel assemblers have been proposed to accelerate the process but there hasn't been any effective approach proposed for GPUs. Also with the increasing power of GPUs, applications from various research fields are being parallelized to take advantage of the massive number of “cores” available in GPUs. In this paper we present the design and development of a GPU based assembler (GAGM) for sequence assembly using Nvidia's GPUs with the CUDA programming model. Our assembler utilizes the mate pair reads produced by the current NGS technologies to build paired de Bruijn graph. Every paired read is broken into paired k-mers and l-mers. Every paired k-mer represents a vertex and paired l-mers are mapped as edges. Contigs are formed by grouping the regions of graph which can be unambiguously connected. We present parallel algorithms for k - mer extraction, paired de Bruijn graph construction and grouping of edges. We have benchmarked GAGM on four bacterial genomes. Our results show that the design on GPU is effective in terms of time as well as the quality of assembly produced.