Marcin Ochab, Marcin Mrukowicz, J. Sarzynski, Urszula Bentkowska
2021 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), published 2021-07-11. DOI: 10.1109/FUZZ45933.2021.9494592 (https://doi.org/10.1109/FUZZ45933.2021.9494592)
Human- and Machine-Generated Traffic Distinction by DNS Protocol Analysis
In this contribution we analyze real DNS traffic collected on the University of Rzeszów campus. All DNS queries and responses observed across the entire network were gathered. The data include traffic generated by students, scholars, and other staff members, as well as by servers, IoT devices, and all other devices connected to the network. The data were collected using the Tshark network protocol analyzer and stored in ClickHouse, a column-oriented database designed for high-volume data analysis. Fuzzy C-means clustering was applied to analyze the DNS traffic and to distinguish between human- and machine-generated traffic. The analysis was performed on a representative sample containing 3 516 094 records and 33 proposed features.
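To illustrate the clustering step named in the abstract, here is a minimal sketch of the standard fuzzy C-means iteration in NumPy. It is not the authors' implementation. The two-cluster setting, the fuzzifier m = 2, the function name `fuzzy_c_means`, and the synthetic two-feature data (which stand in for the 33 features that the abstract does not list) are all assumptions made for the example.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters=2, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Illustrative fuzzy C-means (hypothetical helper, not from the paper).

    Returns (centers, U), where U[i, k] is the degree to which sample i
    belongs to cluster k; every row of U sums to 1.
    """
    rng = np.random.default_rng(seed)
    # Start from random memberships, normalised per sample.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)

    def weighted_centers(U):
        # Cluster centers are means of the samples weighted by U**m.
        W = U ** m
        return (W.T @ X) / W.sum(axis=0)[:, None]

    for _ in range(max_iter):
        centers = weighted_centers(U)
        # Euclidean distance from every sample to every center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # guard against division by zero
        # Membership update: u_ik is proportional to d_ik ** (-2 / (m - 1)).
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return weighted_centers(U), U


# Synthetic stand-in data: two well-separated groups with two made-up features.
rng = np.random.default_rng(42)
group_a = rng.normal(loc=[0.0, 0.0], scale=0.3, size=(50, 2))
group_b = rng.normal(loc=[3.0, 3.0], scale=0.3, size=(50, 2))
X = np.vstack([group_a, group_b])

centers, U = fuzzy_c_means(X)
labels = U.argmax(axis=1)  # hard assignment taken from the soft memberships
```

Unlike k-means, each record receives a membership degree for every cluster, so records whose behaviour falls between the groups can be identified by memberships close to 0.5 instead of being forced into one class.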