BunchBloomer: Cost-Effective Bloom Filter Accelerator for Genomics Applications

Seongyoung Kang, Tarun Sai Ganesh Nerella, Shashank Uppoor, S. Jun
Published in: 2022 32nd International Conference on Field-Programmable Logic and Applications (FPL), 2022-08-01
DOI: 10.1109/FPL57034.2022.00014 (https://doi.org/10.1109/FPL57034.2022.00014)
Citation count: 1

Abstract

Bloom filters are an important tool in many applications, including genomics, where they serve as a compact data structure for counting k-mers, representing de Bruijn graphs, and more. Because they are accessed randomly and must be large for genomics workloads, Bloom filters for genomics can easily become bound by the random-access performance of off-chip memory. This is especially true for accelerators such as FPGAs and GPUs, which can easily remove the computation overhead of the multiple hash functions. As a result, Bloom filter accelerators have typically either focused on small filters that fit in fast on-chip memory or required fast off-chip memory fabric such as Hybrid Memory Cubes. In this work, we present BunchBloomer, which improves the cost-effectiveness of FPGA Bloom filter accelerators by making better use of cheaper, lower-power DDR memory. BunchBloomer uses a multi-layer radix sorter to group table updates into bursts directed to the same 8 KiB memory region, which can be efficiently cached in on-chip memory. A single BunchBloomer device outperforms a costly 12-core server by over 2×, demonstrating an order of magnitude better power efficiency. It even achieves better power efficiency than published FPGA Bloom filter accelerators equipped with Hybrid Memory Cubes.
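The grouping idea in the abstract can be illustrated in software. The following is a minimal sketch, not the authors' hardware design: it assumes a BLAKE2b-based hash family, four hash functions, and a single dictionary bucketing pass standing in for the multi-layer radix sorter. Only the 8 KiB region size comes from the abstract; all function names are hypothetical.

```python
import hashlib
from collections import defaultdict

REGION_BYTES = 8 * 1024         # 8 KiB region size, taken from the abstract
REGION_BITS = REGION_BYTES * 8  # bits covered by one region


def bit_positions(kmer: str, num_bits: int, num_hashes: int) -> list[int]:
    """Derive num_hashes bit positions for a k-mer (assumed hash family)."""
    positions = []
    for seed in range(num_hashes):
        digest = hashlib.blake2b(
            kmer.encode(), digest_size=8, salt=seed.to_bytes(8, "little")
        ).digest()
        positions.append(int.from_bytes(digest, "little") % num_bits)
    return positions


def group_by_region(positions):
    """Bucket bit positions by the 8 KiB region each one falls in.

    A single bucketing pass; the paper describes a multi-layer radix sorter.
    """
    buckets = defaultdict(list)
    for pos in positions:
        buckets[pos // REGION_BITS].append(pos % REGION_BITS)
    return buckets


def apply_grouped(filter_bytes: bytearray, buckets) -> None:
    """Apply all updates for one region before moving to the next."""
    for region in sorted(buckets):
        base = region * REGION_BYTES
        # Local copy stands in for caching the region in on-chip memory.
        cached = bytearray(filter_bytes[base:base + REGION_BYTES])
        for offset in buckets[region]:
            cached[offset // 8] |= 1 << (offset % 8)
        # One write-back per region instead of one access per update.
        filter_bytes[base:base + REGION_BYTES] = cached


def insert_batch(filter_bytes: bytearray, kmers, num_hashes: int = 4) -> None:
    """Insert a batch of k-mers using region-grouped updates."""
    num_bits = len(filter_bytes) * 8
    positions = [
        p for kmer in kmers for p in bit_positions(kmer, num_bits, num_hashes)
    ]
    apply_grouped(filter_bytes, group_by_region(positions))


def contains(filter_bytes: bytearray, kmer: str, num_hashes: int = 4) -> bool:
    """Return True if every bit for the k-mer is set (may be a false positive)."""
    num_bits = len(filter_bytes) * 8
    return all(
        (filter_bytes[p // 8] >> (p % 8)) & 1
        for p in bit_positions(kmer, num_bits, num_hashes)
    )
```

For example, `insert_batch(bytearray(4 * REGION_BYTES), ["ACGT", "TTGA"])` sets the same bits as inserting each k-mer one bit at a time, but touches each 8 KiB region only once per batch. That locality is what lets a region be held in fast memory while its updates are applied.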