Efficient Algorithms for Mining Significant Substructures in Graphs with Quality Guarantees

Huahai He, Ambuj K. Singh
Seventh IEEE International Conference on Data Mining (ICDM 2007), published 2007-10-28
DOI: 10.1109/ICDM.2007.11 (https://doi.org/10.1109/ICDM.2007.11)
Citations: 21

Abstract

Graphs have become popular for modeling scientific data in recent years. As a result, techniques for mining graphs are extremely important for understanding inherent data and domain characteristics. One such exploratory mining paradigm is the k-MST (minimum spanning tree over k vertices) problem that can be used to discover significant local substructures. In this paper, we present an efficient approximation algorithm for the k-MST problem in large graphs. The algorithm has an $O(\sqrt{k})$ approximation ratio and $O(n \log n + m \log m \log k + n k^2 \log k)$ running time, where n and m are the number of vertices and edges respectively. Experimental results on synthetic graphs and protein interaction networks show that the algorithm is scalable to large graphs and useful for discovering biological pathways. The highlight of the algorithm is that it offers both analytical guarantees and empirical evidence of good running time and quality.
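
To make the problem statement concrete, the following is a minimal sketch of an exact brute-force solver for k-MST on a tiny weighted graph. It is not the approximation algorithm presented in the paper: it simply enumerates every k-vertex subset and runs Kruskal's algorithm on the induced subgraph, so it takes exponential time and is only useful for illustrating the definition. The function names (`mst_weight`, `brute_force_k_mst`), the `(weight, u, v)` edge format, and the toy graph are all assumptions made for this illustration.

```python
from itertools import combinations

def mst_weight(vertices, edges):
    """Kruskal's algorithm on the subgraph induced by `vertices`.

    `edges` is a list of (weight, u, v) tuples for an undirected graph.
    Returns (total_weight, tree_edges), or None if the induced
    subgraph is disconnected (no spanning tree exists).
    """
    parent = {v: v for v in vertices}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    total, tree = 0, []
    for w, u, v in sorted(edges):
        # Only consider edges with both endpoints in the chosen subset.
        if u in parent and v in parent:
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                total += w
                tree.append((u, v, w))
    if len(tree) != len(parent) - 1:
        return None
    return total, tree

def brute_force_k_mst(vertices, edges, k):
    """Exact k-MST by exhaustive search over all k-vertex subsets."""
    best = None
    for subset in combinations(vertices, k):
        result = mst_weight(subset, edges)
        if result is not None and (best is None or result[0] < best[0]):
            best = result
    return best

# Hypothetical toy graph: a 6-cycle with one heavy edge (c, d).
VERTICES = ["a", "b", "c", "d", "e", "f"]
EDGES = [
    (1, "a", "b"), (2, "b", "c"), (10, "c", "d"),
    (1, "d", "e"), (1, "e", "f"), (5, "a", "f"),
]

if __name__ == "__main__":
    print(brute_force_k_mst(VERTICES, EDGES, 3))
```

On this toy graph with k = 3, the cheapest tree spans d, e and f with total weight 2, a light local substructure of the kind the abstract describes. The exhaustive search examines all $\binom{n}{k}$ subsets, which is why an approximation algorithm with a proven ratio is needed for large graphs.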