A hardware accelerator for entropy estimation using the top-k most frequent elements

Javier E. Soto, Paulo Ubisse, Cecilia Hernández, M. Figueroa
Published in: 2020 23rd Euromicro Conference on Digital System Design (DSD), 2020-08-01
DOI: 10.1109/DSD51259.2020.00032 (https://doi.org/10.1109/DSD51259.2020.00032)
Citations: 4

Abstract

Estimating the empirical entropy of the elements in a dataset is an important task in data analysis. In particular, empirical entropy can be used effectively to detect anomalies in network traffic. However, computing the empirical entropy of a large dataset is computationally expensive and requires a large amount of memory. This cost is especially significant in high-speed network traffic analysis, where computing the entropy of a data flow in real time requires hardware accelerators with restricted on-chip memory and arithmetic resources. In this work, we propose a method to estimate the entropy using a streaming algorithm with sublinear space requirements. Our approach uses a sketch to estimate the frequency of the elements in the stream, and a priority queue to store the top-k most frequent elements. We show that our method can provide a good approximation of the entropy of the dataset, and present the design of a hardware accelerator that can compute the entropy of the stream with a throughput of one packet per clock cycle. Implemented on a Xilinx Zynq UltraScale+ MPSoC ZCU102 FPGA, our accelerator can operate at line rates above 181 Gbps, consuming 511 mW and using less than 24% of the resources available on the device.
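
The abstract describes the method only at a high level, so the following is a minimal software sketch of the general idea rather than the authors' design. It rests on several assumptions that do not come from the paper: a Count-Min sketch is used as the frequency sketch, a plain dictionary scanned for its minimum stands in for the priority queue, and all stream mass outside the top-k is treated as elements seen exactly once. The names `CountMinSketch`, `estimate_entropy`, and `exact_entropy` are hypothetical.

```python
import hashlib
import math
from collections import Counter


class CountMinSketch:
    """Count-Min sketch (an assumed choice): estimates never undercount."""

    def __init__(self, width=1024, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, item):
        # One salted hash per row; the salt makes the rows independent.
        digest = hashlib.blake2b(
            repr(item).encode(), digest_size=8, salt=bytes([row])
        ).digest()
        return int.from_bytes(digest, "big") % self.width

    def add(self, item):
        """Increment one counter per row; return the updated estimate."""
        estimate = None
        for row in range(self.depth):
            i = self._index(row, item)
            self.rows[row][i] += 1
            count = self.rows[row][i]
            estimate = count if estimate is None else min(estimate, count)
        return estimate


def estimate_entropy(stream, k=8, width=1024, depth=4):
    """Estimate empirical entropy (bits) from the top-k estimated counts."""
    sketch = CountMinSketch(width, depth)
    top = {}  # item -> estimated frequency, at most k entries
    n = 0
    for item in stream:
        n += 1
        est = sketch.add(item)
        if item in top or len(top) < k:
            top[item] = est
        else:
            # Stand-in for the priority queue: find the weakest entry.
            weakest = min(top, key=top.get)
            if est > top[weakest]:
                del top[weakest]
                top[item] = est
    if n == 0:
        return 0.0
    entropy = -sum((f / n) * math.log2(f / n) for f in top.values())
    # ASSUMPTION (not from the paper): everything outside the top-k is
    # treated as singletons, each contributing (1/n) * log2(n).
    rest = max(n - sum(top.values()), 0)
    entropy += (rest / n) * math.log2(n)
    return entropy


def exact_entropy(stream):
    """Reference value: exact empirical entropy using linear memory."""
    counts = Counter(stream)
    n = sum(counts.values())
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

Calling `estimate_entropy(stream, k=...)` on any iterable of hashable items and comparing it with `exact_entropy(stream)` gives a feel for how much `k` and the sketch dimensions affect the approximation. Under the singleton assumption the residual term tends to overestimate the entropy when the untracked elements actually repeat.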