Power-efficiency analysis of accelerated BWA-MEM implementations on heterogeneous computing platforms

2016 International Conference on ReConFigurable Computing and FPGAs (ReConFig) Pub Date : 2016-11-01 DOI:10.1109/ReConFig.2016.7857181

Ernst Houtgast, V. Sima, G. Marchiori, K. Bertels, Z. Al-Ars


Citations: 13

Abstract

Next Generation Sequencing techniques have dramatically reduced the cost of sequencing genetic material, resulting in huge amounts of data being sequenced. Processing this data poses major challenges, both in performance and in power-efficiency. Heterogeneous computing can help on both fronts by enabling faster and more power-efficient solutions. In this paper, the power-efficiency of the BWA-MEM algorithm, a popular tool for genomic data mapping, is studied on two heterogeneous architectures. The performance and power-efficiency of an FPGA-based implementation using a single Xilinx Virtex-7 FPGA on the Alpha Data add-in card are compared with those of a GPU-based implementation using an NVIDIA GeForce GTX 970 and with the software-only baseline system. By offloading the Seed Extension phase to an accelerator, both implementations achieve a two-fold speedup in overall application-level performance over the software-only implementation. Moreover, the highly customizable nature of the FPGA results in much higher power-efficiency, as the FPGA power consumption is less than one fourth of that of the GPU. To facilitate platform- and tool-agnostic comparisons, base pairs per Joule is introduced as a unit of power-efficiency. The FPGA design is able to map up to 44 thousand base pairs per Joule, a 2.1x gain in power-efficiency compared to the software-only baseline.
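The base pairs per Joule measure can be illustrated with a minimal Python sketch. It assumes the metric is simply the number of mapped base pairs divided by the energy consumed (average power multiplied by runtime); the abstract does not spell out the measurement procedure, so this is an illustration, not the authors' method. The function names and the workload, power, and runtime values are hypothetical. The implied baseline figure is only the arithmetic consequence of the two numbers quoted in the abstract (44 thousand base pairs per Joule and a 2.1x gain), under the assumption that both refer to the same measurement.

```python
def base_pairs_per_joule(base_pairs: int, avg_power_watts: float,
                         runtime_seconds: float) -> float:
    """Return mapped base pairs per Joule of energy consumed.

    Assumes energy = average power * runtime (1 J = 1 W * 1 s).
    """
    energy_joules = avg_power_watts * runtime_seconds
    if energy_joules <= 0:
        raise ValueError("energy must be positive")
    return base_pairs / energy_joules


def efficiency_gain(accelerated_bp_per_j: float,
                    baseline_bp_per_j: float) -> float:
    """Ratio of accelerated to baseline power-efficiency."""
    return accelerated_bp_per_j / baseline_bp_per_j


# Hypothetical workload (not taken from the paper):
# 1,000,000 reads of 150 base pairs each, 200 W average power, 100 s runtime.
example = base_pairs_per_joule(1_000_000 * 150, 200.0, 100.0)

# Figures quoted in the abstract: up to 44 thousand base pairs per Joule,
# a 2.1x gain over the software-only baseline. If both refer to the same
# measurement, the baseline would be roughly 21 thousand base pairs per Joule.
fpga_bp_per_j = 44_000
implied_baseline = fpga_bp_per_j / 2.1

print(f"hypothetical run: {example:.0f} bp/J")
print(f"implied baseline: {implied_baseline:.0f} bp/J")
```

Because the unit counts only mapped bases and consumed energy, the same calculation applies to a CPU-only, GPU-accelerated, or FPGA-accelerated run, which is what makes it usable for platform- and tool-agnostic comparisons.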
