在黑暗深处——基于深度学习的恶意软件流量检测，无需专家知识

2019 IEEE Security and Privacy Workshops (SPW) Pub Date : 2019-05-19 DOI:10.1109/SPW.2019.00019

Gonzalo Marín, P. Casas, G. Capdehourat

{"title":"在黑暗深处——基于深度学习的恶意软件流量检测，无需专家知识","authors":"Gonzalo Marín, P. Casas, G. Capdehourat","doi":"10.1109/SPW.2019.00019","DOIUrl":null,"url":null,"abstract":"With the ever-growing occurrence of networking attacks, robust network security systems are essential to prevent and mitigate their harming effects. In recent years, machine learning-based systems have gain popularity for network security applications, usually considering the application of shallow models, where a set of expert handcrafted features are needed to pre-process the data before training. The main problem with this approach is that handcrafted features can fail to perform well given different kinds of scenarios and problems. Deep Learning models can solve this kind of issues using their ability to learn feature representations from input raw or basic, non-processed data. In this paper we explore the power of deep learning models on the specific problem of detection and classification of malware network traffic, using different representations for the input data. As a major advantage as compared to the state of the art, we consider raw measurements coming directly from the stream of monitored bytes as the input to the proposed models, and evaluate different raw-traffic feature representations, including packet and flow-level ones. Our results suggest that deep learning models can better capture the underlying statistics of malicious traffic as compared to classical, shallow-like models, even while operating in the dark, i.e., without any sort of expert handcrafted inputs.","PeriodicalId":125351,"journal":{"name":"2019 IEEE Security and Privacy Workshops (SPW)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"22","resultStr":"{\"title\":\"Deep in the Dark - Deep Learning-Based Malware Traffic Detection Without Expert Knowledge\",\"authors\":\"Gonzalo Marín, P. Casas, G. Capdehourat\",\"doi\":\"10.1109/SPW.2019.00019\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"With the ever-growing occurrence of networking attacks, robust network security systems are essential to prevent and mitigate their harming effects. In recent years, machine learning-based systems have gain popularity for network security applications, usually considering the application of shallow models, where a set of expert handcrafted features are needed to pre-process the data before training. The main problem with this approach is that handcrafted features can fail to perform well given different kinds of scenarios and problems. Deep Learning models can solve this kind of issues using their ability to learn feature representations from input raw or basic, non-processed data. In this paper we explore the power of deep learning models on the specific problem of detection and classification of malware network traffic, using different representations for the input data. As a major advantage as compared to the state of the art, we consider raw measurements coming directly from the stream of monitored bytes as the input to the proposed models, and evaluate different raw-traffic feature representations, including packet and flow-level ones. Our results suggest that deep learning models can better capture the underlying statistics of malicious traffic as compared to classical, shallow-like models, even while operating in the dark, i.e., without any sort of expert handcrafted inputs.\",\"PeriodicalId\":125351,\"journal\":{\"name\":\"2019 IEEE Security and Privacy Workshops (SPW)\",\"volume\":\"13 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-05-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"22\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 IEEE Security and Privacy Workshops (SPW)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/SPW.2019.00019\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE Security and Privacy Workshops (SPW)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SPW.2019.00019","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 22

摘要

随着网络攻击的不断发生，强大的网络安全系统对于预防和减轻网络攻击的危害至关重要。近年来，基于机器学习的系统在网络安全应用中越来越受欢迎，通常考虑浅模型的应用，在训练前需要一组专家手工制作的特征对数据进行预处理。这种方法的主要问题是，在不同的场景和问题下，手工制作的功能可能无法很好地执行。深度学习模型可以利用它们从输入的原始或基本、未处理的数据中学习特征表示的能力来解决这类问题。在本文中，我们探索了深度学习模型在恶意软件网络流量检测和分类的具体问题上的能力，使用不同的输入数据表示。与目前的技术相比，我们的一个主要优势是，我们将直接来自监控字节流的原始测量作为所提议模型的输入，并评估不同的原始流量特征表示，包括数据包和流级别的特征表示。我们的研究结果表明，与经典的浅层模型相比，深度学习模型可以更好地捕获恶意流量的底层统计数据，即使在黑暗中运行，也就是说，没有任何专家手工制作的输入。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Deep in the Dark - Deep Learning-Based Malware Traffic Detection Without Expert Knowledge

With the ever-growing occurrence of networking attacks, robust network security systems are essential to prevent and mitigate their harming effects. In recent years, machine learning-based systems have gain popularity for network security applications, usually considering the application of shallow models, where a set of expert handcrafted features are needed to pre-process the data before training. The main problem with this approach is that handcrafted features can fail to perform well given different kinds of scenarios and problems. Deep Learning models can solve this kind of issues using their ability to learn feature representations from input raw or basic, non-processed data. In this paper we explore the power of deep learning models on the specific problem of detection and classification of malware network traffic, using different representations for the input data. As a major advantage as compared to the state of the art, we consider raw measurements coming directly from the stream of monitored bytes as the input to the proposed models, and evaluate different raw-traffic feature representations, including packet and flow-level ones. Our results suggest that deep learning models can better capture the underlying statistics of malicious traffic as compared to classical, shallow-like models, even while operating in the dark, i.e., without any sort of expert handcrafted inputs.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2019 IEEE Security and Privacy Workshops (SPW)

自引率

0.00%

发文量