A Twig-Based Algorithm for Top-k Subgraph Matching in Large-Scale Graph Data

IF 3.5 3区 计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Haiwei Zhang , Qijie Bai , Yining Lian , Yanlong Wen
{"title":"A Twig-Based Algorithm for Top-k Subgraph Matching in Large-Scale Graph Data","authors":"Haiwei Zhang ,&nbsp;Qijie Bai ,&nbsp;Yining Lian ,&nbsp;Yanlong Wen","doi":"10.1016/j.bdr.2022.100350","DOIUrl":null,"url":null,"abstract":"<div><p><span><span><span>Subgraph matching aims to find similar substructures in a single graph according to a given query graph and is known as a basic query for graph data management. There exist many categories of subgraph matching solutions. Subgraph isomorphism, which is thought of an NP-complete problem, is an initial solution for the subgraph matching task. To speed up the procedure, graph simulation has been presented to match subgraphs with a </span>polynomial complexity of time. Unfortunately, graph simulation usually loses topologies of matched subgraphs because of its loose restrictions. In this paper, we propose an </span>approximation approach named kSGM (top-</span><strong>k S</strong>ubraph <strong>G</strong>raph <strong>M</strong>atching) for subgraph matching based on twig patterns. First, we transform query graphs into twig patterns and match candidate substructures in graph data. Second, we present an optimized join strategy along with top-k mechanism, including join order selection based on cost evaluation and optimized pruning based on maximum/minimum possible score. Finally, we design experiments on real-life and synthetic graph data to evaluate the performance of our work. The results show that our proposed kSGM obviously reduces the time complexity and guarantee the correctness for answering the queries of subgraph matching compared to existing algorithms.</p></div>","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"30 ","pages":"Article 100350"},"PeriodicalIF":3.5000,"publicationDate":"2022-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Big Data Research","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2214579622000442","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

Abstract

Subgraph matching aims to find similar substructures in a single graph according to a given query graph and is known as a basic query for graph data management. There exist many categories of subgraph matching solutions. Subgraph isomorphism, which is thought of an NP-complete problem, is an initial solution for the subgraph matching task. To speed up the procedure, graph simulation has been presented to match subgraphs with a polynomial complexity of time. Unfortunately, graph simulation usually loses topologies of matched subgraphs because of its loose restrictions. In this paper, we propose an approximation approach named kSGM (top-k Subraph Graph Matching) for subgraph matching based on twig patterns. First, we transform query graphs into twig patterns and match candidate substructures in graph data. Second, we present an optimized join strategy along with top-k mechanism, including join order selection based on cost evaluation and optimized pruning based on maximum/minimum possible score. Finally, we design experiments on real-life and synthetic graph data to evaluate the performance of our work. The results show that our proposed kSGM obviously reduces the time complexity and guarantee the correctness for answering the queries of subgraph matching compared to existing algorithms.

基于小枝的大规模图数据Top-k子图匹配算法
子图匹配的目的是根据给定的查询图在单个图中找到相似的子结构,是图数据管理的基本查询。子图匹配解有很多种类型。子图同构是子图匹配问题的初始解,被认为是np完全问题。为了加快子图匹配的速度,提出了以多项式时间复杂度匹配子图的图模拟方法。不幸的是,图模拟由于其宽松的限制,通常会丢失匹配子图的拓扑结构。本文提出了一种基于小枝模式的子图匹配的近似方法kSGM (top-k subgraph Matching)。首先,我们将查询图转换为小枝模式,并在图数据中匹配候选子结构。其次,我们提出了一种基于top-k机制的优化连接策略,包括基于成本评估的连接顺序选择和基于最大/最小可能分数的优化修剪。最后,我们设计了真实和合成图形数据的实验来评估我们工作的性能。结果表明,与现有算法相比,我们提出的kSGM算法明显降低了时间复杂度,保证了回答子图匹配查询的正确性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
Big Data Research
Big Data Research Computer Science-Computer Science Applications
CiteScore
8.40
自引率
3.00%
发文量
0
期刊介绍: The journal aims to promote and communicate advances in big data research by providing a fast and high quality forum for researchers, practitioners and policy makers from the very many different communities working on, and with, this topic. The journal will accept papers on foundational aspects in dealing with big data, as well as papers on specific Platforms and Technologies used to deal with big data. To promote Data Science and interdisciplinary collaboration between fields, and to showcase the benefits of data driven research, papers demonstrating applications of big data in domains as diverse as Geoscience, Social Web, Finance, e-Commerce, Health Care, Environment and Climate, Physics and Astronomy, Chemistry, life sciences and drug discovery, digital libraries and scientific publications, security and government will also be considered. Occasionally the journal may publish whitepapers on policies, standards and best practices.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信