Catching Unusual Traffic Behavior using TF–IDF-based Port Access Statistics Analysis
K. Shima
2021 International Conference on Communications, Computing, Cybersecurity, and Informatics (CCCI), 2021-10-15
DOI: 10.1109/CCCI52664.2021.9583212 (https://doi.org/10.1109/CCCI52664.2021.9583212)
Detecting anomalous traffic behavior is an important task for network operators. In this study, we applied term frequency – inverse document frequency (TF–IDF), a popular method in natural language processing, to detect unusual behavior in network access logs. We mapped the concepts of term and document to the port number and the daily access history, respectively, and calculated the TF–IDF. With this approach, we could identify ports that were accessed frequently but on only a few days, in contrast to other port access activity. Such access behavior is not always malicious; however, it is a good indicator of where to start a deeper analysis of traffic behavior. Using a real-life dataset, we detected two bot-oriented access patterns and one unusual UDP traffic pattern.
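
The mapping described in the abstract (port number as term, one day of access history as document) can be sketched as follows. This is a minimal illustration, not the author's implementation: the abstract does not state which TF–IDF variant was used, so the sketch assumes the textbook form, with TF as a port's share of that day's accesses and IDF as log(N / df). The function name `port_tfidf` and the toy log are hypothetical.

```python
import math
from collections import Counter


def port_tfidf(daily_ports):
    """Score each (day, port) pair with TF-IDF.

    daily_ports: dict mapping a day label to the list of destination
    ports observed on that day (one entry per access).
    Assumed variant: tf = count / accesses that day, idf = log(N / df).
    """
    n_days = len(daily_ports)

    # Document frequency: on how many days each port appears at least once.
    df = Counter()
    for ports in daily_ports.values():
        df.update(set(ports))

    scores = {}
    for day, ports in daily_ports.items():
        counts = Counter(ports)
        total = len(ports)
        for port, count in counts.items():
            tf = count / total
            idf = math.log(n_days / df[port])
            scores[(day, port)] = tf * idf
    return scores


# Hypothetical toy log: ports 80, 443 and 22 appear every day,
# while port 5555 appears in a burst on a single day.
logs = {
    "day1": [80, 80, 443, 22],
    "day2": [80, 443, 443, 22],
    "day3": [80, 443, 22, 5555, 5555, 5555],
}

scores = port_tfidf(logs)
top = max(scores, key=scores.get)
print(top, round(scores[top], 3))
```

In this toy data, ports seen on every day get an IDF of zero and so score zero, while the port that bursts on a single day receives the highest score. That is the "frequent on few days" signal the abstract describes; whether a flagged port is actually malicious still requires the deeper analysis the abstract mentions.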