Ahead of the Curve: A Deeper Understanding of Network Threats Through Machine Learning
Joy Nathalie M. Avelino, Carmi Anne Loren Mora, J. P. Balaquit
TENCON 2018 - 2018 IEEE Region 10 Conference, 2018-10-01
DOI: https://doi.org/10.1109/TENCON.2018.8650218
Citations: 1
Abstract
Big data and machine intelligence are gaining importance in information security as attackers use evasion techniques (polymorphism, encryption, obfuscation) to bypass signature-based detection. Because most threats propagate through the network, proactive techniques are needed to discover an infection before it damages a computer. This paper examines how header-based information and other characteristics of HTTP network traffic can be used to train a machine learning model that captures malicious behavior. Network streams tagged as malicious are preprocessed and clustered. We find that features from the raw byte stream, augmented with handcrafted features, are useful for learning the characteristics of network threats. Within specific clusters, it is possible to identify threats that target a particular server, or to observe characteristics of the injected code that are useful for exploit detection. Clustering malicious network traffic leads to better protection against these types of threats, identification of connected malware campaigns, and insight into future trends.
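The pipeline outlined in the abstract (preprocess HTTP streams, combine raw-byte features with handcrafted ones, then cluster) could be sketched roughly as follows. This is a minimal illustration and not the authors' method: the abstract does not name the features or the clustering algorithm, so the 16-bin byte histogram, the four handcrafted features, the plain k-means routine, and the four toy streams are all assumptions made for the example.

```python
import math

def byte_histogram(raw, bins=16):
    """Normalized histogram of byte values (assumed raw-byte feature)."""
    counts = [0] * bins
    for b in raw:
        counts[b * bins // 256] += 1
    total = max(len(raw), 1)
    return [c / total for c in counts]

def handcrafted(raw):
    """Hypothetical header-based features; the paper's actual set is not given."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.split(b"\r\n")
    parts = lines[0].split(b" ")
    uri = parts[1] if len(parts) > 1 else b""
    return [
        min(len(uri) / 256, 1.0),            # scaled URI length
        min((len(lines) - 1) / 32, 1.0),     # scaled header count
        min(len(body) / 4096, 1.0),          # scaled body size
        uri.count(b"%") / max(len(uri), 1),  # share of percent-encoding markers
    ]

def featurize(raw):
    """Raw-byte features augmented with handcrafted ones."""
    return byte_histogram(raw) + handcrafted(raw)

def kmeans(points, k, iters=20):
    """Plain k-means (an assumed choice), seeded with the first k points."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Toy streams invented for illustration only (not data from the paper):
# two injection-style GET requests and two POSTs carrying binary payloads.
samples = [
    b"GET /index.php?id=1%27%20OR%20%271%27%3D%271 HTTP/1.1\r\n"
    b"Host: a.example\r\n\r\n",
    b"POST /upload HTTP/1.1\r\nHost: c.example\r\n"
    b"Content-Length: 64\r\n\r\n" + bytes(range(128, 192)),
    b"GET /item.php?id=7%27%20OR%20%271%27%3D%271 HTTP/1.1\r\n"
    b"Host: b.example\r\n\r\n",
    b"POST /submit HTTP/1.1\r\nHost: d.example\r\n"
    b"Content-Length: 64\r\n\r\n" + bytes(range(140, 204)),
]

features = [featurize(s) for s in samples]
labels = kmeans(features, k=2)
print(labels)
```

In this sketch the two GET requests should land in one cluster and the two POST requests in the other, which mirrors the idea of grouping streams that share a target or payload style. A real system would likely need a larger feature set, proper scaling, and a way to choose the number of clusters.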