{"title":"CMIRGen:恶意网络流量签名自动生成算法","authors":"Runzi Zhang, Mingkai Tong, Lei Chen, Jianxin Xue, Wenmao Liu, Feng Xie","doi":"10.1109/TrustCom50675.2020.00101","DOIUrl":null,"url":null,"abstract":"Although machine learning (ML) based solutions are ever-evolving for the attack defending paradigm, signatures of malicious network traffic are vital resources for intrusion detection systems (IDSs) and network forensic procedure, covering the lack of interpretability and stability for ML models. However, signature extraction is still a time and labor consuming task nowadays, resulting in possible increase of the attackers' dwell time. Existing automatic solutions rely too much on sequence similarity based and heuristic based methods, encountering performance degradation in large scale and dynamic network environment. In this paper, we present a novel method, called Clustering and Model Inference-based Rule Generation (CMIRGen), automatically generating token-set based signature rules for malicious traffic payloads to be inspected. CMIRGen leverages both optimized sequence similarity based and black-box model inference based methods to extract patterns from homogeneous and heterogeneous payloads respectively. Experimental evaluations have been conducted on several datasets and show the CMIRGen framework can extract discriminative signatures, presenting high recall rate and low false positive rate at the same time for malicious content recognition.","PeriodicalId":221956,"journal":{"name":"2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"CMIRGen: Automatic Signature Generation Algorithm for Malicious Network Traffic\",\"authors\":\"Runzi Zhang, Mingkai Tong, Lei Chen, Jianxin Xue, Wenmao Liu, Feng Xie\",\"doi\":\"10.1109/TrustCom50675.2020.00101\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Although machine learning (ML) based solutions are ever-evolving for the attack defending paradigm, signatures of malicious network traffic are vital resources for intrusion detection systems (IDSs) and network forensic procedure, covering the lack of interpretability and stability for ML models. However, signature extraction is still a time and labor consuming task nowadays, resulting in possible increase of the attackers' dwell time. Existing automatic solutions rely too much on sequence similarity based and heuristic based methods, encountering performance degradation in large scale and dynamic network environment. In this paper, we present a novel method, called Clustering and Model Inference-based Rule Generation (CMIRGen), automatically generating token-set based signature rules for malicious traffic payloads to be inspected. CMIRGen leverages both optimized sequence similarity based and black-box model inference based methods to extract patterns from homogeneous and heterogeneous payloads respectively. Experimental evaluations have been conducted on several datasets and show the CMIRGen framework can extract discriminative signatures, presenting high recall rate and low false positive rate at the same time for malicious content recognition.\",\"PeriodicalId\":221956,\"journal\":{\"name\":\"2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom)\",\"volume\":\"7 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/TrustCom50675.2020.00101\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/TrustCom50675.2020.00101","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
CMIRGen: Automatic Signature Generation Algorithm for Malicious Network Traffic
Although machine learning (ML) based solutions are ever-evolving for the attack defending paradigm, signatures of malicious network traffic are vital resources for intrusion detection systems (IDSs) and network forensic procedure, covering the lack of interpretability and stability for ML models. However, signature extraction is still a time and labor consuming task nowadays, resulting in possible increase of the attackers' dwell time. Existing automatic solutions rely too much on sequence similarity based and heuristic based methods, encountering performance degradation in large scale and dynamic network environment. In this paper, we present a novel method, called Clustering and Model Inference-based Rule Generation (CMIRGen), automatically generating token-set based signature rules for malicious traffic payloads to be inspected. CMIRGen leverages both optimized sequence similarity based and black-box model inference based methods to extract patterns from homogeneous and heterogeneous payloads respectively. Experimental evaluations have been conducted on several datasets and show the CMIRGen framework can extract discriminative signatures, presenting high recall rate and low false positive rate at the same time for malicious content recognition.