Online Detection of Anomalous Heterogeneous Graphs with Streaming Edges
L. Akoglu. 2017 IEEE International Conference on Data Mining Workshops (ICDMW), published 2017-11-01. DOI: 10.1109/ICDMW.2017.133 (https://doi.org/10.1109/ICDMW.2017.133)
Given a stream of heterogeneous edges, comprising different types of nodes and edges, that arrive interleaved across multiple graphs evolving simultaneously, how can we spot the anomalous graphs in real time using only constant memory? This problem is motivated by, and generalizes from, its application in security to host-level advanced persistent threat (APT) detection. In this talk, I will introduce STREAMSPOT, a clustering-based anomaly detection approach for streaming heterogeneous graphs that addresses challenges on two key fronts: (1) heterogeneity and (2) the streaming nature of the data. Specifically, we introduce a new similarity function for heterogeneous graphs that compares two graphs based on the relative frequency of their local substructures, represented as short strings. This function lends itself to a vector representation of each graph, which is (a) fast to compute and (b) amenable to a sketched version of bounded size that preserves the aforementioned similarity. STREAMSPOT exhibits the properties that a streaming application requires. It is (i) fully streaming, processing the stream one edge at a time as it arrives; (ii) memory-efficient, requiring constant space for the sketches and the clustering; (iii) fast, taking constant time to update the graph sketches and the cluster summaries, and processing over 100K edges per second; and (iv) online, scoring and flagging anomalies in real time. Experiments on datasets containing simulated system-call flow graphs from normal browser activity and various attack scenarios (ground truth) show that STREAMSPOT achieves above 95% detection accuracy with small delay, along with competitive response time and memory usage.
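The similarity function and its bounded-size sketch can be illustrated with a minimal Python sketch. It rests on several assumptions that the abstract does not confirm. Each graph is assumed to be summarized as a bag of short strings ("shingles"), similarity is taken to be the cosine between shingle-count vectors, and the sketch is a SimHash-style sign projection, swapped in here because the abstract does not name the sketching method. The names `shingle_vector`, `cosine_similarity`, `StreamingSketch`, and `sketch_similarity` are hypothetical, and how shingles are extracted from typed nodes and edges is not shown.

```python
import hashlib
import math
from collections import Counter


def shingle_vector(shingles):
    """Count how often each shingle (short string) occurs in one graph."""
    return Counter(shingles)


def cosine_similarity(a, b):
    """Cosine similarity between two shingle-count vectors (Counters)."""
    dot = sum(count * b[s] for s, count in a.items())  # missing keys count as 0
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def _sign(shingle, i):
    """Deterministic pseudo-random +1/-1 for a (shingle, projection index) pair."""
    digest = hashlib.blake2b(f"{i}:{shingle}".encode(), digest_size=1).digest()
    return 1 if digest[0] & 1 else -1


class StreamingSketch:
    """Fixed-size sketch of a shingle-count vector (assumed SimHash-style).

    Memory is O(num_bits) regardless of how many distinct shingles appear,
    and each update costs O(num_bits), which is constant for a fixed size.
    """

    def __init__(self, num_bits=64):
        self.projection = [0] * num_bits

    def update(self, shingle, delta=1):
        """Apply a count change for one shingle (delta=-1 when one expires)."""
        for i in range(len(self.projection)):
            self.projection[i] += delta * _sign(shingle, i)

    def bits(self):
        """Sign bits of the running projections."""
        return [1 if p >= 0 else 0 for p in self.projection]


def sketch_similarity(x, y):
    """Estimate cosine similarity from the fraction of agreeing sign bits."""
    agree = sum(1 for a, b in zip(x.bits(), y.bits()) if a == b)
    fraction = agree / len(x.projection)
    return math.cos(math.pi * (1.0 - fraction))


if __name__ == "__main__":
    # Hypothetical shingles; real ones would encode node and edge types.
    g1 = ["AxB", "AxB", "ByC", "CzD"]
    g2 = ["AxB", "ByC", "ByC", "CzD"]
    s1, s2 = StreamingSketch(), StreamingSketch()
    for s in g1:
        s1.update(s)
    for s in g2:
        s2.update(s)
    print("exact:", cosine_similarity(shingle_vector(g1), shingle_vector(g2)))
    print("sketched:", sketch_similarity(s1, s2))
```

The sketched estimate is only approximate and tightens as `num_bits` grows. The point of the design is that a new edge changes a few shingle counts, and each change touches a fixed number of counters, which matches the constant-time, constant-space behaviour described above.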