Distributed Incremental Pattern Matching on Streaming Graphs

Jyun-Sheng Kao, J. Chou
Proceedings of the ACM Workshop on High Performance Graph Processing. Published 2016-05-31.
DOI: 10.1145/2915516.2915519 (https://doi.org/10.1145/2915516.2915519)
Citations: 8

Abstract

Big data has shifted the computing paradigm of data analysis. While some data can be treated as simple text or independent records, many applications, such as social media, road traffic networks, and smart grids, produce data with structural patterns that are modeled as graphs. However, only a limited amount of work has addressed the velocity problem of graph processing. In this work, we aim to develop a distributed processing system for answering pattern matching queries on streaming graphs, where graphs evolve over time as streaming graph update events arrive. To achieve this goal, we proposed an incremental pattern matching algorithm and implemented it on GPS, a vertex-centric distributed graph computing framework. We also extended the GPS framework to support streaming graphs, and adapted a subgraph-centric data model to further reduce communication overhead and improve system performance. Our evaluation using a real wiki trace shows that our approach achieves a 3x to 10x speedup over the batch algorithm and significantly reduces network and memory usage.
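The abstract does not spell out the algorithm, so the following is only a minimal single-machine sketch of the general idea of incremental versus batch pattern matching, not the authors' method and not the GPS API. It assumes one fixed pattern (an undirected triangle) and update events of the form `("add", u, v)` or `("del", u, v)`; the names `TriangleMatcher`, `apply`, and `batch_triangles` are hypothetical. On each event, only matches that involve the changed edge are touched, whereas the batch version rescans the whole graph.

```python
from collections import defaultdict
from itertools import combinations


class TriangleMatcher:
    """Toy incremental matcher for one fixed pattern: the undirected triangle.

    Illustrative assumption only; this is not the algorithm from the paper.
    """

    def __init__(self):
        self.adj = defaultdict(set)  # vertex -> set of neighbours
        self.matches = set()         # each match is a frozenset of 3 vertices

    def apply(self, op, u, v):
        """Apply one update event and patch the match set in place."""
        if u == v:
            return  # self-loops are ignored in this sketch
        if op == "add":
            if v in self.adj[u]:
                return  # edge already present, nothing changes
            # Only triangles that use the new edge (u, v) can appear.
            for w in self.adj[u] & self.adj[v]:
                self.matches.add(frozenset((u, v, w)))
            self.adj[u].add(v)
            self.adj[v].add(u)
        elif op == "del":
            if v not in self.adj[u]:
                return  # edge not present, nothing changes
            self.adj[u].discard(v)
            self.adj[v].discard(u)
            # Only triangles that used the removed edge (u, v) can vanish.
            for w in self.adj[u] & self.adj[v]:
                self.matches.discard(frozenset((u, v, w)))
        else:
            raise ValueError(f"unknown operation: {op!r}")


def batch_triangles(adj):
    """Batch baseline: recompute every triangle from scratch."""
    found = set()
    for u, nbrs in adj.items():
        for v, w in combinations(nbrs, 2):
            if w in adj.get(v, ()):
                found.add(frozenset((u, v, w)))
    return found


events = [
    ("add", 1, 2), ("add", 2, 3), ("add", 1, 3),
    ("add", 3, 4), ("add", 2, 4), ("del", 1, 2),
]
matcher = TriangleMatcher()
for event in events:
    matcher.apply(*event)
    # The incremental result must agree with a full recomputation.
    assert matcher.matches == batch_triangles(matcher.adj)

print(sorted(sorted(t) for t in matcher.matches))  # prints [[2, 3, 4]]
```

In this sketch the work per event is bounded by the neighbourhoods of the two endpoints, which is the intuition behind preferring incremental updates on a fast-changing graph. A distributed, vertex-centric version would additionally have to exchange neighbourhood information between workers, which this toy example does not model.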