Accelerating massive short reads mapping for next generation sequencing (abstract only)

Chunming Zhang, Wen Tang, Guangming Tan
Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays. Published 2014-02-26. DOI: https://doi.org/10.1145/2554688.2554707

Abstract

Gene sequencing data has exploded, with over one billion reads per run, and the data-intensive computations of Next Generation Sequencing (NGS) applications pose great challenges to current computing capability. In this paper we investigate algorithmic and architectural acceleration strategies for a typical NGS analysis algorithm, short-read mapping, on a commodity multicore processor and a customizable FPGA coprocessor architecture, respectively. First, we propose a hash-bucket reordering algorithm that increases shared-cache parallelism while searching the hash index. This algorithmic strategy achieves a throughput of 122 Gbp/day, a 2-fold performance improvement on an 8-core Intel Xeon processor. Second, we develop an FPGA coprocessor that leverages both bit-level and word-level parallelism, together with a scatter-gather memory mechanism, to speed up inherently irregular memory accesses by increasing effective memory bandwidth. Our customized FPGA coprocessor achieves a throughput of 947 Gbp/day, which is 189 times higher than current mapping tools on a single CPU core and more than 2 times higher than a 64-core multiprocessor system. The coprocessor's power efficiency is 29 times higher than that of a conventional 64-core multiprocessor. The results indicate that a customized FPGA coprocessor architecture configured with word-level access to scatter-gather memory is well suited to data-intensive applications.
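The abstract does not spell out how the hash buckets are reordered, so the following is only a minimal sketch of the general idea it names: grouping hash-index lookups by bucket so that consecutive lookups touch nearby index entries. Every name here (`build_index`, `bucket_id`, `map_reads_reordered`), the 2-bit base encoding, and the use of each read's leading k-mer as its seed are assumptions made for illustration, not the authors' method. Python will not show the cache effect itself; the sketch only demonstrates that reordering leaves the lookup results unchanged.

```python
from collections import defaultdict


def build_index(reference, k):
    """Map each k-mer of the reference to the list of positions where it occurs."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def bucket_id(seed, num_buckets):
    """Hypothetical bucket function: encode each base in 2 bits, then take a modulo.

    Assumes the seed contains only A, C, G, T (no ambiguous bases such as N).
    """
    code = 0
    for base in seed:
        code = code * 4 + "ACGT".index(base)
    return code % num_buckets


def map_reads_reordered(reads, index, k, num_buckets):
    """Look up each read's leading k-mer, visiting the reads grouped by bucket.

    Results are written back by original read number, so the output order
    is the same as for a plain in-order lookup.
    """
    order = sorted(range(len(reads)),
                   key=lambda i: bucket_id(reads[i][:k], num_buckets))
    hits = [None] * len(reads)
    for i in order:
        hits[i] = index.get(reads[i][:k], [])
    return hits
```

For example, with `index = build_index("ACGTACGTTG", 4)`, calling `map_reads_reordered(["GTTGAA", "ACGTAC", "CGTTGA", "TTTTTT"], index, 4, 4)` returns the candidate positions for each read in its original order, with an empty list for the read whose seed does not occur in the reference.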