A lightweight semi-centralized strategy for the massive parallelization of branching algorithms

Impact factor: 2.0; Zone 4, Computer Science; JCR Q2, Computer Science, Theory & Methods

Journal: Parallel Computing; Pub Date: 2023-07-01; DOI: 10.1016/j.parco.2023.103024

Andres Pastrana-Cruz, Manuel Lafond

Volume 116, Article 103024. Full text: https://www.sciencedirect.com/science/article/pii/S0167819123000303

Citations: 0

Abstract

A lightweight semi-centralized strategy for the massive parallelization of branching algorithms

Several NP-hard problems are solved exactly using exponential-time branching strategies, whether it be branch-and-bound algorithms, or bounded search trees in fixed-parameter algorithms. The number of tractable instances that can be handled by sequential algorithms is usually small, whereas massive parallelization has been shown to significantly increase the space of instances that can be solved exactly. However, previous centralized approaches require too much communication to be efficient, whereas decentralized approaches are more efficient but have difficulty keeping track of the global state of the exploration.

In this work, we propose to revisit the centralized paradigm while avoiding previous bottlenecks. In our strategy, the center has lightweight responsibilities, requires only a few bits for every communication, but is still able to keep track of the progress of every worker. In particular, the center never holds any task but is able to guarantee that a process with no work always receives the highest priority task globally.
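One way to picture the division of labor described above is the single-process C++17 sketch below. It is only one possible reading of the abstract, not the protocol from the paper and not the GemPBA API: every name (`Task`, `Worker`, `Center`, `serveIdle`) is hypothetical, "larger number means higher priority" is an assumption, and real message passing is replaced by plain function calls. The center stores one small value per worker and never a task; when a worker runs out of work, the center names the worker holding the best pending task, and that worker hands the task over directly.

```cpp
#include <cstddef>
#include <optional>
#include <queue>
#include <string>
#include <vector>

// All names are hypothetical; this is not the GemPBA API.
struct Task {
    int priority;         // larger means more urgent (an assumption)
    std::string payload;  // stands in for a serialized search-tree node
};

struct ByPriority {
    bool operator()(const Task& a, const Task& b) const {
        return a.priority < b.priority;  // makes the queue a max-heap on priority
    }
};

// A worker owns its pending tasks; the center never sees the payloads.
struct Worker {
    std::priority_queue<Task, std::vector<Task>, ByPriority> pending;

    // The only thing reported to the center: the best pending priority, if any.
    std::optional<int> report() const {
        if (pending.empty()) return std::nullopt;
        return pending.top().priority;
    }
};

// The center keeps one small value per worker and holds no tasks.
class Center {
public:
    explicit Center(std::size_t workers) : best_(workers) {}

    void update(std::size_t worker, std::optional<int> bestPriority) {
        best_[worker] = bestPriority;
    }

    // Pick the worker (other than the idle one) with the best reported priority.
    std::optional<std::size_t> donorFor(std::size_t idleWorker) const {
        std::optional<std::size_t> donor;
        for (std::size_t w = 0; w < best_.size(); ++w) {
            if (w == idleWorker || !best_[w]) continue;
            if (!donor || *best_[w] > *best_[*donor]) donor = w;
        }
        return donor;
    }

private:
    std::vector<std::optional<int>> best_;
};

// Serve an idle worker: the center only names a donor; the task itself moves
// from the donor to the idle worker and is returned here as the task to run.
std::optional<Task> serveIdle(Center& center, std::vector<Worker>& workers,
                              std::size_t idle) {
    std::optional<std::size_t> donor = center.donorFor(idle);
    if (!donor) return std::nullopt;  // nobody has spare work
    Worker& d = workers[*donor];
    Task task = d.pending.top();
    d.pending.pop();
    center.update(*donor, d.report());  // donor reports its new best priority
    return task;
}
```

In this toy model the center's state is a vector of optional integers, which mirrors the idea that only a few bits travel to the center while the tasks travel between workers.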

Our strategy was implemented in a generic C++ library called GemPBA, which allows a programmer to convert a sequential branching algorithm into a parallel version by changing only a few lines of code. An experimental case study on the vertex cover problem demonstrates that some of the toughest instances from the DIMACS challenge graphs that would take months to solve sequentially can be handled within two hours with our approach.
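For readers unfamiliar with branching algorithms, the kind of sequential routine that such a library would take as input can be sketched as follows. This is a textbook branch-and-bound for minimum vertex cover written for illustration; it is not the authors' implementation and uses no GemPBA calls, and the names `branch` and `minVertexCover` are hypothetical. The two recursive calls are the independent subproblems that a parallel scheme could hand to different workers.

```cpp
#include <initializer_list>
#include <utility>
#include <vector>

// Hypothetical names; a plain sequential sketch, not GemPBA code.
using Edge = std::pair<int, int>;

// chosen[v] marks vertices already in the cover, size counts them,
// best is the smallest cover size found so far (the incumbent).
void branch(const std::vector<Edge>& edges, std::vector<bool>& chosen,
            int size, int& best) {
    if (size >= best) return;  // bound: this branch cannot beat the incumbent
    for (const Edge& e : edges) {
        if (!chosen[e.first] && !chosen[e.second]) {
            // Branch: every cover must contain at least one endpoint of e.
            for (int v : {e.first, e.second}) {
                chosen[v] = true;
                branch(edges, chosen, size + 1, best);
                chosen[v] = false;
            }
            return;
        }
    }
    best = size;  // no uncovered edge remains and size < best
}

// Size of a minimum vertex cover of a graph on vertices 0..n-1.
int minVertexCover(int n, const std::vector<Edge>& edges) {
    std::vector<bool> chosen(n, false);
    int best = n;  // taking every vertex is always a cover
    branch(edges, chosen, 0, best);
    return best;
}
```

For a triangle, `minVertexCover(3, {{0, 1}, {1, 2}, {0, 2}})` returns 2. The running time is exponential in the worst case, which is why the instances reachable sequentially stay small.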

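Source journal

Parallel Computing (Engineering & Technology, Computer Science: Theory & Methods)

CiteScore: 3.50

Self-citation rate: 7.10%

Articles published: not listed

Review time: 4.5 months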

About the journal: Parallel Computing is an international journal presenting the practical use of parallel computer systems, including high performance architecture, system software, programming systems and tools, and applications. Within this context the journal covers all aspects of high-end parallel computing from single homogeneous or heterogeneous computing nodes to large-scale multi-node systems. Parallel Computing features original research work and review articles as well as novel or illustrative accounts of application experience with (and techniques for) the use of parallel computers. We also welcome studies reproducing prior publications that either confirm or disprove prior published results.

Particular technical areas of interest include, but are not limited to:

- System software for parallel computer systems, including programming languages (new languages as well as compilation techniques), operating systems (including middleware), and resource management (scheduling and load-balancing).
- Enabling software, including debuggers, performance tools, and system and numeric libraries.
- General hardware (architecture) concepts, new technologies enabling the realization of such new concepts, and details of commercially available systems.
- Software engineering and productivity as it relates to parallel computing.
- Applications (including scientific computing, deep learning, machine learning) or tool case studies demonstrating novel ways to achieve parallelism.
- Performance measurement results on state-of-the-art systems.
- Approaches to effectively utilize large-scale parallel computing, including new algorithms or algorithm analysis with demonstrated relevance to real applications using existing or next generation parallel computer architectures.
- Parallel I/O systems, both hardware and software.
- Networking technology for support of high-speed computing, demonstrating the impact of high-speed computation on parallel applications.