M. Mongiovì, Petko Bogdanov, R. Ranca, Ambuj K. Singh, E. Papalexakis, C. Faloutsos
{"title":"SigSpot:从时间演化的网络中挖掘重要的异常区域(仅抽象)","authors":"M. Mongiovì, Petko Bogdanov, R. Ranca, Ambuj K. Singh, E. Papalexakis, C. Faloutsos","doi":"10.1145/2213836.2213974","DOIUrl":null,"url":null,"abstract":"Anomaly detection in dynamic networks has a rich gamut of application domains, such as road networks, communication networks and water distribution networks. An anomalous event, such as a traffic accident, denial of service attack or a chemical spill, can cause a local shift from normal behavior in the network state that persists over an interval of time. Detecting such anomalous regions of network and time extent in large real-world networks is a challenging task. Existing anomaly detection techniques focus on either the time series associated with individual network edges or on global anomalies that affect the entire network. In order to detect anomalous regions, one needs to consider both the time and the affected network substructure jointly, which brings forth computational challenges due to the combinatorial nature of possible solutions. We propose the problem of mining all Significant Anomalous Regions (SAR) in time-evolving networks that asks for the discovery of connected temporal subgraphs comprised of edges that significantly deviate from normal in a persistent manner. We propose an optimal Baseline algorithm for the problem and an efficient approximation, called S IG S POT. Compared to Baseline, SIGSPOT is up to one order of magnitude faster in real data, while achieving less than 10% average relative error rate. In synthetic datasets it is more than 30 times faster than Baseline with 94% accuracy and solves efficiently large instances that are infeasible (more than 10 hours running time) for Baseline. We demonstrate the utility of SIGSPOT for inferring accidents on road networks and study its scalability when detecting anomalies in social, transportation and synthetic evolving networks, spanning up to 1GB.","PeriodicalId":212616,"journal":{"name":"Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data","volume":"22 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2012-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":"{\"title\":\"SigSpot: mining significant anomalous regions from time-evolving networks (abstract only)\",\"authors\":\"M. Mongiovì, Petko Bogdanov, R. Ranca, Ambuj K. Singh, E. Papalexakis, C. Faloutsos\",\"doi\":\"10.1145/2213836.2213974\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Anomaly detection in dynamic networks has a rich gamut of application domains, such as road networks, communication networks and water distribution networks. An anomalous event, such as a traffic accident, denial of service attack or a chemical spill, can cause a local shift from normal behavior in the network state that persists over an interval of time. Detecting such anomalous regions of network and time extent in large real-world networks is a challenging task. Existing anomaly detection techniques focus on either the time series associated with individual network edges or on global anomalies that affect the entire network. In order to detect anomalous regions, one needs to consider both the time and the affected network substructure jointly, which brings forth computational challenges due to the combinatorial nature of possible solutions. We propose the problem of mining all Significant Anomalous Regions (SAR) in time-evolving networks that asks for the discovery of connected temporal subgraphs comprised of edges that significantly deviate from normal in a persistent manner. We propose an optimal Baseline algorithm for the problem and an efficient approximation, called S IG S POT. Compared to Baseline, SIGSPOT is up to one order of magnitude faster in real data, while achieving less than 10% average relative error rate. In synthetic datasets it is more than 30 times faster than Baseline with 94% accuracy and solves efficiently large instances that are infeasible (more than 10 hours running time) for Baseline. We demonstrate the utility of SIGSPOT for inferring accidents on road networks and study its scalability when detecting anomalies in social, transportation and synthetic evolving networks, spanning up to 1GB.\",\"PeriodicalId\":212616,\"journal\":{\"name\":\"Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data\",\"volume\":\"22 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2012-05-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"4\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2213836.2213974\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2213836.2213974","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 4
摘要
动态网络异常检测具有广泛的应用领域,如道路网络、通信网络和供水网络等。异常事件,如交通事故、拒绝服务攻击或化学品泄漏,可能会导致网络状态中的局部行为在一段时间内持续变化。在现实世界的大型网络中检测网络的异常区域和时间范围是一项具有挑战性的任务。现有的异常检测技术要么关注与单个网络边缘相关的时间序列,要么关注影响整个网络的全局异常。为了检测异常区域,需要同时考虑时间和受影响的网络子结构,由于可能解的组合性,这给计算带来了挑战。我们提出了在时间进化网络中挖掘所有显著异常区域(SAR)的问题,该问题要求发现由持续显著偏离正态的边组成的连接时间子图。我们提出了一个最优的基线算法和一个有效的近似,称为S - IG - S - POT。与基线相比,SIGSPOT在实际数据中的速度快了一个数量级,同时平均相对错误率低于10%。在合成数据集中,它比Baseline快30倍以上,准确率达到94%,并有效地解决了Baseline无法实现的大型实例(运行时间超过10小时)。我们展示了SIGSPOT在推断道路网络事故方面的实用性,并研究了其在检测社交、交通和合成进化网络异常时的可扩展性,其覆盖范围高达1GB。
SigSpot: mining significant anomalous regions from time-evolving networks (abstract only)
Anomaly detection in dynamic networks has a rich gamut of application domains, such as road networks, communication networks and water distribution networks. An anomalous event, such as a traffic accident, denial of service attack or a chemical spill, can cause a local shift from normal behavior in the network state that persists over an interval of time. Detecting such anomalous regions of network and time extent in large real-world networks is a challenging task. Existing anomaly detection techniques focus on either the time series associated with individual network edges or on global anomalies that affect the entire network. In order to detect anomalous regions, one needs to consider both the time and the affected network substructure jointly, which brings forth computational challenges due to the combinatorial nature of possible solutions. We propose the problem of mining all Significant Anomalous Regions (SAR) in time-evolving networks that asks for the discovery of connected temporal subgraphs comprised of edges that significantly deviate from normal in a persistent manner. We propose an optimal Baseline algorithm for the problem and an efficient approximation, called S IG S POT. Compared to Baseline, SIGSPOT is up to one order of magnitude faster in real data, while achieving less than 10% average relative error rate. In synthetic datasets it is more than 30 times faster than Baseline with 94% accuracy and solves efficiently large instances that are infeasible (more than 10 hours running time) for Baseline. We demonstrate the utility of SIGSPOT for inferring accidents on road networks and study its scalability when detecting anomalies in social, transportation and synthetic evolving networks, spanning up to 1GB.