Finding Heavy Hitters by Packet Count Flow Sampling
Z. Zhu, Hai Zhang, Wenming Guo
2008 International Conference on Computer and Electrical Engineering, 2008-12-20
DOI: 10.1109/ICCEE.2008.90 (https://doi.org/10.1109/ICCEE.2008.90)
Citations: 1
Abstract
In many applications, ranging from network congestion monitoring to data mining, it is often desirable to identify the items in a large data set whose frequency is above a given threshold. This can help us find the heaviest users, the most popular web sites, and so on. Our work focuses on the problem of finding heavy hitters by packet count, which is especially suited to attacks such as SYN floods and port scans. These kinds of anomalies do not occupy much bandwidth, but they can still seriously affect the Internet. A major difficulty in detecting heavy hitters at a high-speed monitoring point is that the traffic can contain millions of flows. We therefore present a threshold sampling technique. It selects large flows in preference to small ones, and it can control the resources consumed by adjusting the threshold. The main procedures of this method are aggregating packet counts by source IP address and sorting them. The experimental results show that the heavy hitters found in the sample approximate those found in the original dataset, demonstrating that our method is effective.
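The abstract does not give the exact sampling rule, so the following Python sketch is only an illustration of the three steps it names (aggregate packet counts by source IP, threshold-sample, sort). It assumes the common form of threshold sampling in which a flow with packet count x is kept with probability min(1, x / z) for a threshold z, and a kept flow is reported with the estimate max(x, z). The function names, the `(src_ip, dst_ip)` packet representation, and the parameter `z` are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def aggregate_by_source(packets):
    """Count packets per source IP. Each packet is a (src_ip, dst_ip) pair."""
    return Counter(src for src, _dst in packets)


def threshold_sample(counts, z, rng=None):
    """Keep a flow with probability min(1, x / z), where x is its packet count.

    Assumed rule, not confirmed by the abstract. Flows with x >= z are always
    kept; smaller flows are kept at random and reported with the estimate z,
    so raising z shrinks the sample and the resources it consumes.
    """
    rng = rng or random.Random()
    sample = {}
    for src, x in counts.items():
        if x >= z or rng.random() < x / z:
            sample[src] = max(x, z)
    return sample


def top_k(sample, k):
    """Sort sampled flows by estimated packet count, largest first."""
    return sorted(sample.items(), key=lambda kv: kv[1], reverse=True)[:k]


if __name__ == "__main__":
    # Synthetic traffic: two heavy sources and many one-packet sources.
    packets = [("10.0.0.1", "192.0.2.1")] * 500
    packets += [("10.0.0.2", "192.0.2.1")] * 300
    packets += [(f"172.16.{i // 250}.{i % 250}", "192.0.2.1") for i in range(1000)]

    counts = aggregate_by_source(packets)
    sample = threshold_sample(counts, z=100, rng=random.Random(0))
    print(len(counts), "flows before sampling,", len(sample), "after")
    print(top_k(sample, 2))
```

In this sketch every flow at or above the threshold survives sampling with its exact count, which is why the top of the sorted sample can match the top of the full data set while most small flows are discarded.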