Towards Distributed 2-Approximation Steiner Minimal Trees in Billion-edge Graphs

Tahsin Reza, G. Sanders, R. Pearce
Published in: 2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
Publication date: 2022-05-01
DOI: 10.48550/arXiv.2205.14503 (https://doi.org/10.48550/arXiv.2205.14503)
Citations: 2

Abstract
Given an edge-weighted graph and a set of known seed vertices of interest, a network scientist often wants to understand the graph relationships that explain connections between the seed vertices. If the seed set has 2 vertices, shortest path calculations are an attractive computational kernel for exploring the connections between them. When the seed set has 3 or more vertices (say, up to thousands), a Steiner minimal tree, the minimum-weight acyclic connected subgraph of the input graph that contains all the seed vertices, is an attractive generalization of shortest weighted paths. In general, computing a Steiner minimal tree is NP-hard, but decades ago several polynomial-time algorithms were designed and proven to yield Steiner trees whose total weight is at most 2 times that of the Steiner minimal tree. Despite the rich theoretical literature, work on parallel Steiner minimal tree computation and scalable implementations is rather scarce. In this paper, we present a parallel 2-approximation Steiner minimal tree algorithm (with theoretical guarantees) and its MPI-based distributed implementation. In place of distance computation between all pairs of seed vertices, an expensive phase in many approximation algorithms, our solution exploits Voronoi cell computation. This approach also has higher parallel efficiency than others that involve minimum spanning tree computation on the entire graph. Furthermore, our distributed design exploits asynchronous processing and a message prioritization scheme to accelerate convergence of distance computation, employs techniques to avoid inefficient distributed spanning tree computation on the entire graph, and harnesses a combination of vertex-centric and edge-centric processing to offer fast time-to-solution.
We demonstrate the scalability and performance of our solution using real-world graphs with up to 128 billion edges and 512 compute nodes (8K processes), show the ability to find Steiner trees with up to 10K seed vertices in under one minute, and present in-depth analyses that highlight the benefits of our design choices. Using four real-world graphs and three seed sets for each, we compare our solution with the state-of-the-art exact Steiner minimal tree solver, SCIP-Jack, and with two sequential algorithms that have the same approximation bound as our algorithm. Our distributed solution comfortably outperforms these related works on graphs with tens of millions of edges and offers decent strong scaling, with up to 90% efficiency. We empirically show that, on average, the total distance (sum of edge weights) of the Steiner tree identified by our solution is 1.0527 times that of the Steiner minimal tree (i.e., the optimal solution), well within the theoretical bound of at most 2.
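The Voronoi-cell idea mentioned in the abstract can be illustrated with a small sequential sketch. This is not the authors' MPI-based distributed implementation. It follows the classic Voronoi-based 2-approximation usually attributed to Mehlhorn, and that the paper builds on this scheme is an assumption here. The function name `steiner_tree_2approx`, the edge-list input format, and the helper `norm` are hypothetical choices made for illustration.

```python
import heapq
from collections import defaultdict


def norm(u, v):
    """Canonical key for an undirected edge or seed pair (hypothetical helper)."""
    return (u, v) if u <= v else (v, u)


def steiner_tree_2approx(edges, seeds):
    """Sequential sketch of a Voronoi-based 2-approximation Steiner tree.

    edges: iterable of (u, v, w) with w >= 0, undirected.
    seeds: iterable of distinct vertices assumed to lie in one connected component.
    Returns (set of tree edges as (u, v) pairs, total weight).
    """
    seeds = list(seeds)
    adj = defaultdict(list)
    weight = {}
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
        key = norm(u, v)
        weight[key] = min(w, weight.get(key, w))  # keep lightest parallel edge

    # Step 1: multi-source Dijkstra. cell[v] is the nearest seed of v
    # (its Voronoi cell); pred[v] is the next hop back toward that seed.
    dist = {s: 0 for s in seeds}
    cell = {s: s for s in seeds}
    pred = {}
    heap = [(0, s) for s in seeds]
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if v not in dist or nd < dist[v]:
                dist[v], cell[v], pred[v] = nd, cell[u], u
                heapq.heappush(heap, (nd, v))

    # Step 2: for each pair of adjacent cells keep the cheapest bridging edge.
    # The candidate length is dist(seed_a, u) + w(u, v) + dist(v, seed_b).
    best = {}
    for (u, v), w in weight.items():
        if u in cell and v in cell and cell[u] != cell[v]:
            key = norm(cell[u], cell[v])
            cand = (dist[u] + w + dist[v], u, v)
            if key not in best or cand < best[key]:
                best[key] = cand

    # Step 3: Kruskal's MST over the seeds using the bridging candidates.
    parent = {s: s for s in seeds}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = set()
    for (_, u, v), (s, t) in sorted((c, k) for k, c in best.items()):
        rs, rt = find(s), find(t)
        if rs == rt:
            continue
        parent[rs] = rt
        # Step 4: expand the chosen seed-to-seed link into real graph edges.
        tree.add(norm(u, v))
        for x in (u, v):
            while x in pred:
                tree.add(norm(x, pred[x]))
                x = pred[x]

    return tree, sum(weight[e] for e in tree)
```

A call such as `steiner_tree_2approx([(0, 1, 1), (0, 2, 1), (0, 3, 1), (1, 2, 3)], [1, 2, 3])` returns the selected edges and their total weight. The point of the sketch is that a single multi-source shortest-path pass replaces one shortest-path computation per seed, and the spanning tree is built only over the seeds rather than over the whole graph.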