Brief Announcement: Graph Matching in Massive Datasets

Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures. Pub Date: 2017-07-24. DOI: 10.1145/3087556.3087601

Soheil Behnezhad, M. Derakhshan, Hossein Esfandiari, E. Tan, Hadi Yami


Citations: 11

Abstract



In this paper we consider the maximum matching problem in large bipartite graphs. We present a new algorithm that finds the maximum matching in a few iterations of a novel edge sampling technique. The algorithm can be implemented in big data settings such as the streaming and MapReduce settings, where each iteration maps to one pass over the stream or one MapReduce round of computation, respectively. We prove that the algorithm provides a $(1-\epsilon)$-approximate solution to the maximum matching in $1/\epsilon$ rounds, which improves on prior work in terms of the number of passes/rounds. The algorithm performs even better on real datasets, where it finds the exact maximum matching in 4 to 8 rounds while sampling only about 1% of the total edges.
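The abstract does not spell out the sampling rule, so the following is only a rough sketch of the general round-based pattern it describes: each round makes one pass over the edge stream, keeps a sample of edges in memory, and recomputes a matching on the sampled subgraph. Uniform sampling with probability `p`, the function names `max_matching` and `sampled_matching`, and the use of Kuhn's augmenting-path algorithm are all assumptions made for illustration; this is not the authors' technique and carries none of their guarantees.

```python
import random
from typing import Dict, List, Set, Tuple

Edge = Tuple[int, int]  # (left vertex id, right vertex id)


def max_matching(adj: Dict[int, List[int]]) -> Dict[int, int]:
    """Kuhn's augmenting-path algorithm on a bipartite adjacency map.

    Returns a dict mapping each matched right vertex to its left partner.
    """
    match_right: Dict[int, int] = {}

    def try_augment(u: int, seen: Set[int]) -> bool:
        for v in adj.get(u, []):
            if v in seen:
                continue
            seen.add(v)
            # v is free, or its current partner can be re-routed elsewhere.
            if v not in match_right or try_augment(match_right[v], seen):
                match_right[v] = u
                return True
        return False

    for u in adj:
        try_augment(u, set())
    return match_right


def sampled_matching(edges: List[Edge], rounds: int, p: float,
                     seed: int = 0) -> List[Edge]:
    """Hypothetical round-based scheme (assumption, not the paper's method).

    Each round is one pass over `edges`; every not-yet-sampled edge is kept
    with probability `p`, then a maximum matching of the sample is computed.
    """
    rng = random.Random(seed)
    sampled: Set[Edge] = set()
    matching: List[Edge] = []
    for _ in range(rounds):
        for e in edges:  # one pass over the stream
            if e not in sampled and rng.random() < p:
                sampled.add(e)
        adj: Dict[int, List[int]] = {}
        for u, v in sorted(sampled):
            adj.setdefault(u, []).append(v)
        matching = [(u, v) for v, u in max_matching(adj).items()]
    return matching
```

Because the sample only grows from round to round, the matching size never decreases. With `p = 1.0` every edge is kept after a single pass, so the sketch degenerates to an exact maximum matching on the full graph; smaller values of `p` trade memory for extra rounds.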
