Parallel Depth-First Search for Directed Acyclic Graphs
M. Naumov, A. Vrielink, M. Garland
Proceedings of the Seventh Workshop on Irregular Applications: Architectures and Algorithms, 2017-11-12
DOI: 10.1145/3149704.3149764 (https://doi.org/10.1145/3149704.3149764)
Citations: 14
Abstract
Depth-First Search (DFS) is a pervasive algorithm, often used as a building block for topological sort, connectivity and planarity testing, among many other applications. We propose a novel work-efficient parallel algorithm for the DFS traversal of a directed acyclic graph (DAG). The algorithm traverses the entire DAG in a BFS-like fashion no more than three times. As a result, it finds the DFS pre-order (discovery) and post-order (finish time) as well as the parent relationship associated with every node in a DAG. We analyse the runtime and work complexity of this novel parallel algorithm. We also show that our algorithm is easy to implement and optimize for performance. In particular, we show that its CUDA implementation on the GPU outperforms sequential DFS on the CPU by up to 6x in our experiments.
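To make the three outputs named in the abstract concrete, the following is a minimal sketch of the sequential DFS baseline, not the authors' parallel algorithm. It computes the pre-order, post-order and parent of every node in a small DAG. The function name `sequential_dfs`, the adjacency-dictionary representation and the example graph are assumptions chosen for illustration; children are visited in the order they appear in each adjacency list.

```python
from typing import Dict, Hashable, List, Optional, Tuple

Node = Hashable


def sequential_dfs(
    adj: Dict[Node, List[Node]], roots: List[Node]
) -> Tuple[Dict[Node, int], Dict[Node, int], Dict[Node, Optional[Node]]]:
    """Iterative DFS returning (pre-order, post-order, parent) maps.

    Hypothetical helper for illustration only: a sequential reference,
    not the BFS-like parallel traversal described in the abstract.
    """
    pre: Dict[Node, int] = {}      # discovery index of each node
    post: Dict[Node, int] = {}     # finish index of each node
    parent: Dict[Node, Optional[Node]] = {}
    pre_counter = 0
    post_counter = 0

    for root in roots:
        if root in pre:
            continue  # already reached from an earlier root
        parent[root] = None
        pre[root] = pre_counter
        pre_counter += 1
        # Each stack entry keeps a node and an iterator over its children,
        # so the traversal resumes where it left off after a child finishes.
        stack = [(root, iter(adj.get(root, [])))]
        while stack:
            node, children = stack[-1]
            for child in children:
                if child not in pre:
                    parent[child] = node
                    pre[child] = pre_counter
                    pre_counter += 1
                    stack.append((child, iter(adj.get(child, []))))
                    break
            else:
                # No undiscovered children remain: the node is finished.
                post[node] = post_counter
                post_counter += 1
                stack.pop()
    return pre, post, parent


# Example DAG (an assumption for illustration): a -> b, a -> c, b -> d, c -> d
dag = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
pre, post, parent = sequential_dfs(dag, roots=["a"])

# Reversing the post-order gives a topological order of the DAG.
topo_order = sorted(post, key=post.get, reverse=True)
```

In this example node `d` is reachable through both `b` and `c`, but its DFS parent is `b` because `b` is explored first. Resolving that kind of ordering dependency without walking the graph one path at a time is what a parallel formulation has to address.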