SK-Tree: a systematic malware detection algorithm on streaming trees via the signature kernel

Thomas Cochrane, Peter Foster, Varun Chhabra, M. Lemercier, C. Salvi, Terry Lyons
Published in: 2021 IEEE International Conference on Cyber Security and Resilience (CSR). Publication date: 2021-02-16. DOI: 10.1109/CSR51186.2021.9527933 (https://doi.org/10.1109/CSR51186.2021.9527933)
Citations: 13

Abstract

The development of machine learning algorithms in the cyber security domain has been impeded by the complex, hierarchical, sequential and multimodal nature of the data involved. In this paper we introduce the notion of a streaming tree as a generic data structure encompassing a large portion of real-world cyber security data. Starting from host-based event logs, we represent computer processes as streaming trees that evolve in continuous time. Leveraging the properties of the signature kernel, a machine learning tool that has recently emerged as a leading technology for learning with complex sequences of data, we develop the SK-Tree algorithm. SK-Tree is a supervised learning method for systematic malware detection on streaming trees that is robust to irregular sampling and high dimensionality of the underlying streams. We demonstrate the effectiveness of SK-Tree at detecting malicious events on a portion of the publicly available DARPA OpTC dataset, achieving an AUROC score of 98%.
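
To make the role of the signature kernel more concrete, the following is a minimal sketch of how a signature kernel between two sampled streams can be approximated. It is not the SK-Tree implementation, which the abstract does not describe. It assumes the known characterisation of the signature kernel as the solution of the Goursat PDE ∂²k/∂s∂t = ⟨x'(s), y'(t)⟩ k with boundary values equal to 1, and applies a first-order explicit finite-difference scheme on the grid given by the sample points. The function name `signature_kernel` and the toy streams are hypothetical and chosen only for illustration.

```python
def signature_kernel(x, y):
    """Approximate the signature kernel of two piecewise-linear streams.

    x, y: lists of points; each point is a list of floats of equal dimension.
    The two streams may have different lengths (no common sampling grid needed).
    """
    # Increments between consecutive samples of each stream.
    dx = [[b - a for a, b in zip(p, q)] for p, q in zip(x, x[1:])]
    dy = [[b - a for a, b in zip(p, q)] for p, q in zip(y, y[1:])]
    m, n = len(dx), len(dy)

    # Boundary conditions of the Goursat problem: k(0, t) = k(s, 0) = 1.
    k = [[1.0] * (n + 1) for _ in range(m + 1)]

    for i in range(m):
        for j in range(n):
            # Inner product of the two increments on this grid cell.
            inc = sum(a * b for a, b in zip(dx[i], dy[j]))
            # First-order explicit update of the PDE solution.
            k[i + 1][j + 1] = k[i + 1][j] + k[i][j + 1] + (inc - 1.0) * k[i][j]

    return k[m][n]


# Toy usage: two 2-dimensional streams with different numbers of samples.
stream_a = [[0.0, 0.0], [1.0, 0.5], [1.5, 2.0]]
stream_b = [[0.0, 0.0], [0.3, -1.0], [2.0, 1.0], [2.5, 0.0]]
value = signature_kernel(stream_a, stream_b)
```

Because the update only uses increments of each stream, the two inputs need not share timestamps or lengths, which is one informal way to see why kernels of this type tolerate irregular sampling. The scheme is first order, so its accuracy depends on how finely the streams are sampled. A Gram matrix of such values over a training set could then be passed to any kernel method that accepts a precomputed kernel.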