Soheil Behnezhad, M. Derakhshan, Hossein Esfandiari, E. Tan, Hadi Yami
Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures, 2017-07-24. DOI: 10.1145/3087556.3087601 (https://doi.org/10.1145/3087556.3087601)
Brief Announcement: Graph Matching in Massive Datasets
In this paper we consider the maximum matching problem in large bipartite graphs. We present a new algorithm that finds a maximum matching in a few iterations of a novel edge sampling technique. The algorithm can be implemented in big data settings such as the streaming and MapReduce settings, where each iteration corresponds to one pass over the stream or one MapReduce round of computation, respectively. We prove that the algorithm provides a $(1-\epsilon)$-approximate solution to the maximum matching in $1/\epsilon$ rounds, which improves on prior work in the number of passes/rounds. The algorithm performs even better on real datasets, finding the exact maximum matching in 4 to 8 rounds while sampling only about 1% of the total edges.
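To make the round structure concrete, the following is a minimal sketch of a generic sample-then-match loop, in which each round corresponds to one pass over the edge stream. It is not the authors' algorithm: the abstract does not specify the sampling rule, so uniform sampling with probability `sample_prob` is used here purely as a placeholder assumption. The function names `max_matching_size` and `sampled_matching_sizes` are hypothetical, and the matching on the retained edges is computed with the standard augmenting-path method (Kuhn's algorithm).

```python
import random
from collections import defaultdict


def max_matching_size(edges):
    """Exact maximum bipartite matching size via augmenting paths (Kuhn's algorithm).

    `edges` is an iterable of (left_vertex, right_vertex) pairs.
    """
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
    match_of_right = {}  # right vertex -> left vertex currently matched to it

    def try_augment(u, seen):
        # Try to match left vertex u, re-routing earlier matches if needed.
        for v in adj[u]:
            if v in seen:
                continue
            seen.add(v)
            if v not in match_of_right or try_augment(match_of_right[v], seen):
                match_of_right[v] = u
                return True
        return False

    return sum(try_augment(u, set()) for u in list(adj))


def sampled_matching_sizes(edges, rounds, sample_prob, seed=0):
    """Return the matching size found on the retained edges after each round.

    ASSUMPTION: uniform edge sampling stands in for the paper's sampling
    technique, which the abstract does not describe.
    """
    rng = random.Random(seed)
    kept = set()   # edges retained across rounds
    sizes = []
    for _ in range(rounds):
        for edge in edges:  # one pass over the edge stream
            if rng.random() < sample_prob:
                kept.add(edge)
        sizes.append(max_matching_size(kept))
    return sizes


if __name__ == "__main__":
    # Small illustrative bipartite graph on 6 + 6 vertices.
    edges = [(i, j) for i in range(6) for j in range(6) if (i + j) % 2 == 0 or i == j]
    print("exact:", max_matching_size(edges))
    print("per round:", sampled_matching_sizes(edges, rounds=5, sample_prob=0.3, seed=1))
```

Because the retained edge set only grows, the per-round matching size in this sketch is non-decreasing and never exceeds the exact optimum. How quickly it approaches the optimum depends on the sampling rule, which is the part the paper contributes and this sketch does not reproduce.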