DEMO: top-k cardinality estimation with HyperLogLog sketches
V. Bruschi, S. Pontarelli, Jerome Tollet, D. Barach, G. Bianchi
2021 24th Conference on Innovation in Clouds, Internet and Networks and Workshops (ICIN), published 2021-03-01
DOI: 10.1109/ICIN51074.2021.9385549 (https://doi.org/10.1109/ICIN51074.2021.9385549)
Abstract
A recurring task in security monitoring is finding scan-type flows, namely flows that exhibit a large cardinality in terms of the number of distinct source/destination addresses or, more generally, packet-level identifiers (e.g. ports, header fields). Cardinality estimation, however, requires “remembering” the identifiers seen in the past, and becomes quite challenging when the goal is to implement per-flow distinct counting at wire speed while maintaining high processing throughput and a limited memory footprint. In this demo, we show how to use HyperLogLog sketches to implement an efficient and innovative top-k cardinality estimation algorithm called FlowFight. The algorithm has been tested and integrated in a full-fledged software router, the Vector Packet Processor (VPP).
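To make the building block concrete, the following is a minimal sketch of a generic HyperLogLog counter, followed by a naive per-flow top-k baseline that keeps one sketch per flow. It is not FlowFight: the abstract does not describe that algorithm's internals, and the names `HyperLogLog`, `top_k_flows`, the precision parameter `p`, and the choice of BLAKE2b as the hash are all assumptions made for illustration only.

```python
import hashlib
import heapq
import math


class HyperLogLog:
    """Generic HyperLogLog sketch with m = 2**p one-byte registers (illustrative)."""

    def __init__(self, p=12):
        self.p = p
        self.m = 1 << p
        self.registers = bytearray(self.m)
        # Bias-correction constant; this closed form is meant for m >= 128.
        self.alpha = 0.7213 / (1.0 + 1.079 / self.m)

    def add(self, item):
        # 64-bit hash of the identifier (hash choice is an assumption).
        digest = hashlib.blake2b(item.encode("utf-8"), digest_size=8).digest()
        h = int.from_bytes(digest, "big")
        idx = h >> (64 - self.p)                  # top p bits select a register
        rest = h & ((1 << (64 - self.p)) - 1)     # remaining 64 - p bits
        # Rank = position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def estimate(self):
        z = sum(2.0 ** -r for r in self.registers)
        raw = self.alpha * self.m * self.m / z
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            return self.m * math.log(self.m / zeros)
        return raw


def top_k_flows(packets, k, p=12):
    """Naive baseline: one sketch per flow, then pick the k largest estimates.

    `packets` is an iterable of (flow_key, identifier) string pairs.
    """
    sketches = {}
    for flow_key, identifier in packets:
        if flow_key not in sketches:
            sketches[flow_key] = HyperLogLog(p)
        sketches[flow_key].add(identifier)
    ranked = ((s.estimate(), f) for f, s in sketches.items())
    return [(f, est) for est, f in heapq.nlargest(k, ranked)]
```

Each sketch here costs 2^p bytes regardless of how many identifiers it absorbs, which is what makes per-flow distinct counting feasible in bounded memory. The baseline still allocates a sketch for every flow it sees, which is exactly the cost a dedicated top-k scheme would aim to avoid.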