The Impact of Different System Call Representations on Intrusion Detection

Log. J. IGPL, Pub Date: 2020-10-30, DOI: 10.1093/jigpal/jzaa058

Sarah Wunderlich, Markus Ring, D. Landes, A. Hotho

Citations: 3

Abstract

Over the years, artificial neural networks have been applied successfully in many areas including IT security. Yet, neural networks can only process continuous input data. This is particularly challenging for security-related, non-continuous data like system calls of an operating system. This work focuses on five different options to preprocess sequences of system calls so that they can be processed by neural networks. These input options are based on one-hot encodings and learning word2vec, GloVe or fastText representations of system calls. As an additional option, we analyse if mapping system calls to their respective kernel modules is an adequate generalization step for (i) replacing system calls or (ii) enhancing system call data with additional information regarding their context. When performing such preprocessing steps it is important to ensure that no relevant information is lost during the process. The overall objective of system call analysis in the context of IT security is to categorize a sequence of them as benign or malicious behavior. Therefore, this scenario is used to evaluate different system call representations in a classification task. Results indicate that a broader range of attacks can be detected when enriching system call representations with corresponding kernel module information. Prior learning of embeddings does not achieve significant improvements. This work is an extension of the work by Wunderlich et al. [1] published in Advances in Intelligent Systems and Computing (AISC, volume 951).
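
To make the input options in the abstract more concrete, the following is a minimal sketch of two of them: plain one-hot encoding of a system call sequence, and the same encoding enriched with the kernel module of each call. The mapping `SYSCALL_TO_MODULE`, the module names, and the helper functions are hypothetical and chosen only for illustration; they are not taken from the paper, and the concatenation used for enrichment is one plausible choice rather than the authors' confirmed method.

```python
from typing import Dict, List

# Hypothetical mapping from system call name to a kernel module/subsystem.
# Illustrative only; NOT the mapping used in the paper.
SYSCALL_TO_MODULE: Dict[str, str] = {
    "open": "fs",
    "read": "fs",
    "write": "fs",
    "mmap": "mm",
    "socket": "net",
}


def build_vocab(tokens: List[str]) -> Dict[str, int]:
    """Assign each distinct token an index (sorted for determinism)."""
    return {tok: i for i, tok in enumerate(sorted(set(tokens)))}


def one_hot(token: str, vocab: Dict[str, int]) -> List[float]:
    """Return a vector with a single 1.0 at the token's index."""
    vec = [0.0] * len(vocab)
    vec[vocab[token]] = 1.0
    return vec


def encode_sequence(calls: List[str], enrich: bool = False) -> List[List[float]]:
    """One-hot encode each call; optionally append its module's one-hot vector."""
    call_vocab = build_vocab(list(SYSCALL_TO_MODULE))
    module_vocab = build_vocab(list(SYSCALL_TO_MODULE.values()))
    encoded = []
    for call in calls:
        vec = one_hot(call, call_vocab)
        if enrich:
            # Assumed enrichment: concatenate call and module encodings.
            vec = vec + one_hot(SYSCALL_TO_MODULE[call], module_vocab)
        encoded.append(vec)
    return encoded


if __name__ == "__main__":
    trace = ["open", "read", "socket"]
    for row in encode_sequence(trace, enrich=True):
        print(row)
```

With `enrich=False` each call becomes a vector with one entry per known system call; with `enrich=True` three further entries identify the module, so calls from the same module share part of their representation. A learned embedding (word2vec, GloVe or fastText) would replace the one-hot vectors with dense ones trained on call sequences.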

Source journal

Log. J. IGPL

Self-citation rate

0.00%

Number of publications