Online Detection of Anomalous Heterogeneous Graphs with Streaming Edges

L. Akoglu
Published in: 2017 IEEE International Conference on Data Mining Workshops (ICDMW), 2017-11-01
DOI: 10.1109/ICDMW.2017.133 (https://doi.org/10.1109/ICDMW.2017.133)

Abstract

Given a stream of heterogeneous edges, comprising different types of nodes and edges, that arrive interleaved across multiple graphs evolving simultaneously, how can we spot the anomalous graphs in real time using only constant memory? This problem is motivated by, and generalizes from, its application in security to host-level advanced persistent threat (APT) detection. In this talk, I will introduce STREAMSPOT, a clustering-based anomaly detection approach for streaming heterogeneous graphs that addresses challenges on two key fronts: (1) heterogeneity and (2) the streaming nature of the data. Specifically, we introduce a new similarity function for heterogeneous graphs that compares two graphs based on the relative frequencies of their local substructures, represented as short strings. This function lends itself to a vector representation of each graph that is (a) fast to compute and (b) amenable to a sketched version of bounded size that preserves the aforementioned similarity. STREAMSPOT exhibits the properties that a streaming application requires: it is (i) fully streaming, processing the stream one edge at a time as it arrives; (ii) memory-efficient, requiring constant space for the sketches and the clustering; (iii) fast, taking constant time to update the graph sketches and the cluster summaries, and able to process over 100K edges per second; and (iv) online, scoring and flagging anomalies in real time. Experiments on datasets containing simulated system-call flow graphs from normal browser activity and various attack scenarios (ground truth) show that STREAMSPOT performs well, achieving above 95% detection accuracy with small delay and competitive response time and memory usage.
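
To make the idea of comparing graphs by the frequencies of short substructure strings more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the STREAMSPOT implementation: the abstract does not specify how the strings are built or how the sketch is computed, so the per-node string construction, the cosine similarity, and the SimHash-style sign sketch below are stand-ins, and every name (`shingle_counts`, `cosine`, `sketch`, `sketch_similarity`) is hypothetical. The sketch is also recomputed from scratch here; the constant-time streaming update described in the abstract is not shown.

```python
from collections import Counter
import hashlib
import math

# An edge is (src_id, src_type, dst_id, dst_type, edge_type); types are
# single characters here purely for readability (an assumption).

def shingle_counts(edges):
    """Count one short string per source node: its type followed by the
    (edge type, destination type) pairs of its outgoing edges, in arrival order."""
    parts = {}
    for src, src_type, _dst, dst_type, edge_type in edges:
        parts.setdefault(src, [src_type]).append(edge_type + dst_type)
    return Counter("".join(p) for p in parts.values())

def cosine(a, b):
    """Cosine similarity of two count vectors. Cosine is scale-invariant, so
    raw counts give the same value as relative frequencies."""
    dot = sum(a[s] * b[s] for s in a.keys() & b.keys())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def _sign(i, s):
    """Deterministic pseudo-random +1/-1 for projection i and string s."""
    digest = hashlib.sha256(f"{i}:{s}".encode()).digest()
    return 1 if digest[0] & 1 else -1

def sketch(counts, bits=64):
    """Fixed-size sign sketch: bit i is the sign of the count vector
    projected onto a hashed random +1/-1 direction (SimHash-style)."""
    return [
        1 if sum(_sign(i, s) * c for s, c in counts.items()) >= 0 else -1
        for i in range(bits)
    ]

def sketch_similarity(x, y):
    """Estimate cosine similarity from the fraction of agreeing sketch bits."""
    agree = sum(1 for a, b in zip(x, y) if a == b) / len(x)
    return math.cos(math.pi * (1.0 - agree))

# Two graphs with the same typed structure but different node ids,
# and a third graph with a different structure.
g1 = [("p1", "P", "f1", "F", "r"), ("p1", "P", "f2", "F", "w")]
g2 = [("p7", "P", "f8", "F", "r"), ("p7", "P", "f9", "F", "w")]
g3 = [("p9", "P", "s1", "S", "c")]

c1, c2, c3 = shingle_counts(g1), shingle_counts(g2), shingle_counts(g3)
print(cosine(c1, c2), cosine(c1, c3))
print(sketch_similarity(sketch(c1), sketch(c2)))
```

Because the strings depend only on node and edge types, `g1` and `g2` map to the same count vector even though their node ids differ, while `g3` shares no string with either. The sketch has a fixed length regardless of how many distinct strings a graph produces, which is the property that makes a bounded-memory representation possible.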