LessMine: Reducing Sample Space and Data Access for Dense Pattern Mining

Tianyu Fu, Ziqian Wan, Guohao Dai, Yu Wang, Huazhong Yang
Published in: 2020 IEEE High Performance Extreme Computing Conference (HPEC)
Publication date: 2020-09-22
DOI: 10.1109/HPEC43674.2020.9286187 (https://doi.org/10.1109/HPEC43674.2020.9286187)
Citation count: 1

Abstract

In the era of “big data”, graphs have proven to be one of the most important reflections of real-world problems. Dense pattern mining plays a significant role in distilling the core properties of large-scale graphs. Because pattern mining problems are computationally complex, conventional implementations often scale poorly and consume considerable time and memory. Previous work (e.g., ASAP [1]) proposed approximate pattern mining as an efficient way to extract structural information from graphs, demonstrating performance improvements of up to two orders of magnitude. However, we observe three main flaws in ASAP when it is applied to dense patterns. We therefore propose LessMine, which reduces the sample space and data access for dense pattern mining. We introduce a reorganized data structure, a concurrent sampling method, and uniform close. We also provide locality-aware partitioning for distributed settings. Our evaluation shows that the design achieves up to 1829× speedup with a 66% lower error rate than ASAP.
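To make the idea of approximate pattern mining concrete, the following is a minimal sketch of a sampling-based estimator for the simplest dense pattern, the triangle. It is not the ASAP or LessMine algorithm: it uses plain uniform edge sampling, and the function names (`build_adjacency`, `exact_triangles`, `estimate_triangles`) are hypothetical and chosen only for illustration. It assumes the input is a list of unique undirected edges with no self-loops.

```python
import random
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {vertex: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def exact_triangles(edges):
    """Exact triangle count: each triangle is seen once per edge, so divide by 3."""
    adj = build_adjacency(edges)
    return sum(len(adj[u] & adj[v]) for u, v in edges) // 3


def estimate_triangles(edges, num_samples, seed=0):
    """Estimate the triangle count from uniformly sampled edges.

    For a sampled edge (u, v), the number of common neighbors of u and v
    equals the number of triangles containing that edge. Scaling the sample
    mean by the edge count and dividing by 3 gives an unbiased estimate.
    """
    rng = random.Random(seed)
    adj = build_adjacency(edges)
    total = 0
    for _ in range(num_samples):
        u, v = rng.choice(edges)
        total += len(adj[u] & adj[v])
    return len(edges) * total / (3 * num_samples)


if __name__ == "__main__":
    # Complete graph on 5 vertices: every edge lies in the same number of
    # triangles, so the estimate matches the exact count for any sample.
    edges = list(combinations(range(5), 2))
    print(exact_triangles(edges), estimate_triangles(edges, num_samples=50))
```

In this sketch, accuracy depends only on the number of sampled edges rather than on enumerating every pattern instance, which is the general trade-off that approximate pattern mining exploits.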