Metric forensics: a multi-level approach for mining volatile graphs

Keith W. Henderson, Tina Eliassi-Rad, C. Faloutsos, L. Akoglu, Lei Li, Koji Maruhashi, B. Prakash, Hanghang Tong
Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining. Published 2010-07-25. DOI: 10.1145/1835804.1835828 (https://doi.org/10.1145/1835804.1835828)
Citations: 71

Abstract

Advances in data collection and storage capacity have made it increasingly possible to collect highly volatile graph data for analysis. Existing graph analysis techniques are not appropriate for such data, especially in cases where streaming or near-real-time results are required. An example that has drawn significant research interest is the cyber-security domain, where internet communication traces are collected and real-time discovery of events, behaviors, patterns, and anomalies is desired. We propose MetricForensics, a scalable framework for analysis of volatile graphs. MetricForensics combines a multi-level "drill down" approach, a collection of user-selected graph metrics, and a collection of analysis techniques. At each successive level, more sophisticated metrics are computed and the graph is viewed at finer temporal resolutions. In this way, MetricForensics scales to highly volatile graphs by allocating resources for computationally expensive analysis only after an interesting event has been discovered at a coarser resolution. We test MetricForensics on three real-world graphs: an enterprise IP trace, a trace of legitimate and malicious network traffic from a research institution, and the MIT Reality Mining proximity sensor data. Our largest graph has 3M vertices and 32M edges, spanning 4.5 days. The results demonstrate the scalability and capability of MetricForensics in analyzing volatile graphs and highlight four novel phenomena in such graphs: elbows, broken correlations, prolonged spikes, and lightweight stars.
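The drill-down idea in the abstract can be illustrated with a minimal Python sketch. This is not the MetricForensics implementation: the function names (`bucket`, `max_degree`, `drill_down`), the choice of metrics (edge count at the coarse level, maximum vertex degree at the fine level), the window widths, and the median-based flagging rule are all assumptions made for illustration. The sketch only shows the general pattern of running a cheap metric over coarse time windows and spending effort on a costlier metric at finer resolution only inside windows that look unusual.

```python
from collections import Counter, defaultdict
from statistics import median


def bucket(edges, width):
    """Group (timestamp, src, dst) edges into time windows of the given width."""
    windows = defaultdict(list)
    for t, src, dst in edges:
        windows[t // width].append((t, src, dst))
    return windows


def max_degree(edges):
    """Costlier per-window metric (assumed for illustration): largest vertex degree."""
    deg = Counter()
    for _, src, dst in edges:
        deg[src] += 1
        deg[dst] += 1
    return max(deg.values()) if deg else 0


def drill_down(edges, coarse=60, fine=10, factor=3.0):
    """Return {coarse_window: {fine_window: max_degree}} for flagged windows only."""
    coarse_windows = bucket(edges, coarse)
    if not coarse_windows:
        return {}
    # Level 1: cheap metric (edge count) at coarse resolution.
    counts = {w: len(es) for w, es in coarse_windows.items()}
    baseline = median(counts.values())
    report = {}
    for w, n in sorted(counts.items()):
        # Hypothetical flagging rule: edge count far above the median window.
        if n > factor * baseline:
            # Level 2: costlier metric at finer resolution, flagged windows only.
            fine_windows = bucket(coarse_windows[w], fine)
            report[w] = {fw: max_degree(es) for fw, es in sorted(fine_windows.items())}
    return report
```

Calling `drill_down` on a list of `(timestamp, src, dst)` tuples returns an empty dictionary when no coarse window stands out, so the finer analysis is skipped entirely for quiet traffic. A real system would presumably use richer metrics and detection rules than the ones assumed here.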