Statistical network behavior based threat detection
Jin Cao, L. Drabeck, Ran He
2017 IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), 2017-05-01
DOI: 10.1109/INFCOMW.2017.8116413 (https://doi.org/10.1109/INFCOMW.2017.8116413)
Citations: 1
Abstract
Malware, short for malicious software, continues to morph and change. Traditional anti-virus software may have problems detecting malicious software that has not been seen before. By employing machine learning techniques, one can learn the general behavior patterns of different threat types and use these to detect variants of unknown threats. We have developed a malware detection system based on machine learning that uses features derived from a user's network flows to external hosts. A novel aspect of our technique is to separate hosts into different groups by how commonly they are visited by users and then to develop user features separately for each of these host groups. The network data used to train the detector comes from malware samples that have been run in a sandbox and from normal users' traffic collected from an LTE wireless network provider. Specifically, we use the AdaBoost algorithm as the classification engine and achieve a false alarm rate of 0.78% and an accuracy of 96.5% in detecting users infected with malware. We also provide high and low confidence regions for our system based on subclasses of threats.
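The host-grouping idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the flow record layout `(user, host, bytes)`, the three group names, the popularity thresholds, and the per-group features (flow count, distinct hosts, total bytes) are all assumptions chosen for illustration, since the abstract does not specify them.

```python
from collections import defaultdict

# Hypothetical group labels; the abstract only says "different groups".
GROUPS = ("common", "medium", "rare")


def group_hosts_by_popularity(flows, common_share=0.5, medium_share=0.3):
    """Label each host by the fraction of users that contact it.

    flows: iterable of (user, host, bytes) tuples (assumed layout).
    The two thresholds are illustrative values, not taken from the paper.
    """
    users_per_host = defaultdict(set)
    all_users = set()
    for user, host, _ in flows:
        users_per_host[host].add(user)
        all_users.add(user)

    groups = {}
    for host, users in users_per_host.items():
        share = len(users) / len(all_users)
        if share >= common_share:
            groups[host] = "common"
        elif share >= medium_share:
            groups[host] = "medium"
        else:
            groups[host] = "rare"
    return groups


def user_features(flows, groups):
    """Build one feature dict per user, with separate features per host group."""
    feats = defaultdict(
        lambda: {f"{g}_{k}": 0 for g in GROUPS for k in ("flows", "hosts", "bytes")}
    )
    hosts_seen = defaultdict(set)  # (user, group) -> hosts already counted
    for user, host, nbytes in flows:
        g = groups[host]
        feats[user][f"{g}_flows"] += 1
        feats[user][f"{g}_bytes"] += nbytes
        if host not in hosts_seen[(user, g)]:
            hosts_seen[(user, g)].add(host)
            feats[user][f"{g}_hosts"] += 1
    return dict(feats)
```

Each resulting feature dict could then be turned into a fixed-length vector and passed to a boosted classifier such as AdaBoost, with labels taken from sandbox (infected) and provider (normal) traffic as the abstract describes.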