Protocol Keywords Extraction Method Based on Frequent Item-Sets Mining
Gaochao Li, Q. Qian, Zhonghua Wang, Xin Zou, Xunxun Chen, Xiao Wu
Proceedings of the 1st International Conference on Information Science and Systems, 2018-04-27
DOI: 10.1145/3209914.3209937 (https://doi.org/10.1145/3209914.3209937)
Citations: 2
Abstract
Network application identification is widely used in network management, network optimization, and intrusion detection. Among the available methods, DPI (Deep Packet Inspection) is the most popular: it achieves high accuracy while relying on only a small amount of payload data. However, DPI depends on effective protocol keywords. To keep pace with how quickly applications are updated, we propose a protocol keyword extraction method for unencrypted network applications based on frequent item-set mining. It consists of two major steps. First, we generate candidate words using unsupervised methods and reduce the size of the word set with rules on word length and position. Then, we extract effective protocol keywords with a frequent item-set mining method and remove noise words and redundant words by evaluating the co-occurrence relationships among candidate words. Experimental results show that, compared with Proword, our method shrinks the keyword set and is better at extracting the real protocol keywords.
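
The two steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the candidate generator, the thresholds, or the exact redundancy test, so everything below is an assumption made for illustration. Candidates are byte n-grams bounded by length (`min_len`, `max_len`) and start position (`max_offset`), mining is a small Apriori-style search, and a word is treated as redundant when a longer frequent word contains it and the two always co-occur. All function and parameter names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def candidate_words(payload, min_len=3, max_len=6, max_offset=8):
    """Step 1 (assumed form): byte n-grams filtered by length and position."""
    words = set()
    for start in range(min(max_offset, len(payload))):
        for n in range(min_len, max_len + 1):
            if start + n <= len(payload):
                words.add(payload[start:start + n])
    return words


def frequent_itemsets(transactions, min_support, max_size=2):
    """Apriori-style level-wise search; returns {frozenset: support count}."""
    counts = Counter(w for t in transactions for w in t)
    current = {frozenset([w]): c for w, c in counts.items() if c >= min_support}
    result = dict(current)
    for size in range(2, max_size + 1):
        items = sorted({w for s in current for w in s})
        next_level = {}
        for combo in combinations(items, size):
            # Apriori pruning: every (size - 1)-subset must already be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(combo, size - 1)):
                support = sum(1 for t in transactions if t.issuperset(combo))
                if support >= min_support:
                    next_level[frozenset(combo)] = support
        current = next_level
        result.update(current)
    return result


def extract_keywords(payloads, min_support):
    """Step 2 (assumed form): mine frequent words, then drop redundant ones."""
    transactions = [candidate_words(p) for p in payloads]
    freq = frequent_itemsets(transactions, min_support, max_size=2)
    words = {next(iter(s)): c for s, c in freq.items() if len(s) == 1}
    keywords = {}
    for w, support in words.items():
        # Assumed redundancy rule: w is dropped if a longer frequent word v
        # contains it and w never occurs in a message without v.
        redundant = any(
            len(v) > len(w) and w in v
            and freq.get(frozenset([w, v])) == support
            for v in words
        )
        if not redundant:
            keywords[w] = support
    return keywords


# Toy unencrypted payloads, invented for this example.
payloads = [
    b"GET /index.html HTTP/1.1",
    b"GET /a.png HTTP/1.1",
    b"GET /b HTTP/1.0",
    b"HEAD /x HTTP/1.1",
]
print(extract_keywords(payloads, min_support=3))  # {b'GET /': 3}
```

On this toy input, six n-grams inside `GET /` reach the support threshold, and the co-occurrence check collapses them into the single longest word, which mirrors the abstract's claim that the keyword set shrinks. Enumerating all pairs is only practical for small candidate sets; a real trace would need a more scalable miner.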