Human- and Machine-Generated Traffic Distinction by DNS Protocol Analysis
Marcin Ochab, Marcin Mrukowicz, J. Sarzynski, Urszula Bentkowska
2021 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), 2021-07-11
DOI: https://doi.org/10.1109/FUZZ45933.2021.9494592
Abstract
In this contribution we analyze real DNS traffic collected on the University of Rzeszów campus. All DNS queries and responses observed across the entire network were gathered. The data include traffic generated by students, scholars, and other staff members, as well as by servers, IoT devices, and all other devices connected to the network. The data were collected using the Tshark network protocol analyzer and stored in ClickHouse, a column-oriented database designed for high-volume data analysis. Fuzzy C-means clustering was applied to analyze the DNS traffic and to distinguish between human- and machine-generated traffic. The analysis was performed on a representative sample containing 3,516,094 records and 33 proposed features.
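The fuzzy C-means step mentioned above can be illustrated with a minimal sketch of the standard algorithm. It is not the authors' implementation: the abstract does not list the 33 features or the clustering parameters, so the function name `fuzzy_c_means`, the fuzzifier `m = 2.0`, the two-cluster setting, and the two toy features per record are all assumptions made for illustration only.

```python
import random


def fuzzy_c_means(points, n_clusters=2, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy C-means on a list of equal-length feature vectors.

    Returns (centers, memberships); memberships[i][j] is the degree to
    which point i belongs to cluster j, and each row sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(n_iter):
        # Center update: mean of the points weighted by membership ** m.
        centers = []
        for j in range(n_clusters):
            w = [u[i][j] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / w_sum
                 for d in range(dim)]
            )

        # Membership update: u_ij is proportional to d_ij ** (-2 / (m - 1)).
        shift = 0.0
        for i in range(n):
            dist = [
                max(sum((points[i][d] - centers[j][d]) ** 2
                        for d in range(dim)) ** 0.5, 1e-12)
                for j in range(n_clusters)
            ]
            inv = [dj ** (-2.0 / (m - 1.0)) for dj in dist]
            total = sum(inv)
            new_row = [v / total for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row

        # Stop once no membership changed by more than the tolerance.
        if shift < tol:
            break

    return centers, u


# Hypothetical toy records with two made-up features each
# (NOT the paper's features or data).
toy = [[0.0, 0.1], [0.1, 0.0], [0.2, 0.1],
       [5.0, 5.1], [5.1, 4.9], [4.9, 5.0]]
centers, memberships = fuzzy_c_means(toy, n_clusters=2)
```

Unlike hard clustering, each record receives a membership degree in every cluster, so a client whose DNS behaviour mixes human-like and machine-like patterns can be represented as partially belonging to both groups.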