The Trapezoidal Sketch for Frequency Estimation in Network Flow
Ning Li, Xin Yuan, José-Fernán Martínez, Vicente Hernández Díaz
2021 IEEE 29th International Conference on Network Protocols (ICNP), published 2021-11-01
DOI: 10.1109/ICNP52444.2021.9651949 (https://doi.org/10.1109/ICNP52444.2021.9651949)
Citations: 0
Abstract
The sketch is a typical and widely used data structure for estimating the frequencies of items in data streams. However, because all counters in a traditional rectangular sketch (r-sketch) have the same size, it is hard to achieve small space usage, high capacity (i.e., the maximum frequency that can be recorded), and high estimation accuracy simultaneously. The high skewness of data streams makes this problem even worse. We therefore propose the trapezoidal sketch (t-sketch). Unlike the r-sketch, the t-sketch uses different counter sizes in different layers, so low space usage and high capacity can be achieved simultaneously. Building on the basic t-sketch, we further propose the space-saving t-sketch and the capacity-improvement t-sketch, and analyze the properties of both. Simulation results show that, compared with the CM sketch, CU sketch, C sketch, and A sketch, the space-saving t-sketch and the capacity-improvement t-sketch improve space usage, capacity, and estimation accuracy.
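
The abstract does not spell out the t-sketch's update or query rules, so the following Python sketch only illustrates the general idea of giving each layer a different counter size. It is built on a standard Count-Min sketch with saturating counters. The class name `LayeredSketch`, the chosen bit widths, the hashing scheme, and the rule of ignoring saturated counters at query time are all assumptions made for illustration, not the authors' design.

```python
import hashlib


class LayeredSketch:
    """Count-Min-style sketch whose layers may use different counter bit widths.

    Hypothetical illustration only; not the t-sketch algorithm from the paper.
    """

    def __init__(self, width, bits_per_layer):
        self.width = width                              # counters per layer
        self.bits = list(bits_per_layer)                # counter size of each layer, in bits
        self.caps = [(1 << b) - 1 for b in self.bits]   # largest value each layer can hold
        self.rows = [[0] * width for _ in self.bits]

    def _index(self, item, layer):
        # One deterministic hash per layer, derived by salting with the layer number.
        digest = hashlib.blake2b(f"{layer}:{item}".encode(), digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.width

    def update(self, item, count=1):
        for layer, row in enumerate(self.rows):
            i = self._index(item, layer)
            # Saturate instead of overflowing (assumed behaviour).
            row[i] = min(row[i] + count, self.caps[layer])

    def query(self, item):
        # Counters below their cap were never clipped, so each one is an
        # upper bound on the true frequency; take the smallest of them.
        values = []
        for layer, row in enumerate(self.rows):
            v = row[self._index(item, layer)]
            if v < self.caps[layer]:
                values.append(v)
        # If every counter is saturated, fall back to the largest cap.
        return min(values) if values else max(self.caps)

    def space_bits(self):
        return self.width * sum(self.bits)


# Same number of counters, different counter sizes per layer.
rect = LayeredSketch(width=64, bits_per_layer=[16, 16, 16])
trap = LayeredSketch(width=64, bits_per_layer=[4, 8, 16])

trap.update("heavy", 1000)
print(rect.space_bits(), trap.space_bits())  # 3072 1792
print(trap.query("heavy"))                   # 1000
```

In this toy setting the mixed-width layout uses fewer bits than the uniform 16-bit layout while the widest layer still records a frequency of 1000, which mirrors the space and capacity trade-off the abstract describes. How the actual t-sketch distributes items across layers and how accurate it is are questions only the full paper answers.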