Proceedings of the 2023 ACM Workshop on Highlights of Parallel Computing: Latest Papers

Provably Fast and Space-Efficient Parallel Biconnectivity (Abstract)
Proceedings of the 2023 ACM Workshop on Highlights of Parallel Computing Pub Date: 2023-07-18 DOI: 10.1145/3597635.3598018
Xiaojun Dong, Letong Wang, Yan Gu, Yihan Sun
Abstract: We propose the first parallel biconnectivity algorithm (FAST-BCC) that has optimal work, polylogarithmic span, and is space-efficient. Our algorithm creates a skeleton graph based on any spanning tree of the input graph. Then we use the connectivity information of the skeleton to compute the biconnectivity of the original input. We carefully analyze the correctness of our algorithm. We implemented FAST-BCC and compared it with existing implementations, including GBBS, Slota and Madduri's algorithm, and the sequential Hopcroft-Tarjan algorithm. We tested them on a 96-core machine on 27 graphs with varying edge distributions. FAST-BCC is faster than all existing baselines on each graph.
Citations: 1
Empirical Challenge for NC Theory (Abstract)
Proceedings of the 2023 ACM Workshop on Highlights of Parallel Computing Pub Date: 2023-07-18 DOI: 10.1145/3597635.3598020
Ananth Hari, U. Vishkin
Abstract: Horn-satisfiability, or Horn-SAT, is the problem of deciding whether a satisfying assignment exists for a Horn formula, a conjunction of clauses each with at most one positive literal (also known as Horn clauses). It is a well-known P-complete problem, which implies that unless P = NC, it is a hard problem to parallelize. In this paper, we empirically show that, under a known simple random model for generating the Horn formula, the ratio of hard-to-parallelize instances (closer to the worst-case behavior) is infinitesimally small. We show that the depth of a parallel algorithm for Horn-SAT is polylogarithmic on average, for almost all instances, while keeping the work linear. This challenges theoreticians and programmers to look beyond worst-case analysis and come up with practical algorithms coupled with respective performance guarantees.
Citations: 0
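To make the Horn-SAT problem in the abstract above concrete, here is a minimal sketch of the standard sequential forward-chaining (unit propagation) procedure. It is not the parallel algorithm studied in the paper. The clause encoding `(body, head)` and the name `horn_sat` are assumptions chosen for illustration: `body` lists the variables that appear negated, and `head` is the single positive variable, or `None` if the clause has no positive literal.

```python
from collections import deque

def horn_sat(clauses):
    """Return the set of variables forced to true, or None if unsatisfiable.

    Each clause is (body, head): the implication body[0] & body[1] & ... -> head.
    head=None encodes a clause with no positive literal (body must not all hold).
    The encoding and function name are illustrative assumptions.
    """
    # remaining[i] counts body variables of clause i not yet known to be true.
    remaining = [len(set(body)) for body, _ in clauses]
    # watchers[v] lists the clauses whose body contains variable v.
    watchers = {}
    for i, (body, _) in enumerate(clauses):
        for v in set(body):
            watchers.setdefault(v, []).append(i)

    true_vars = set()
    # Clauses with an empty body fire immediately (facts).
    queue = deque(i for i, r in enumerate(remaining) if r == 0)
    while queue:
        i = queue.popleft()
        head = clauses[i][1]
        if head is None:
            return None            # whole body is true but nothing may follow
        if head in true_vars:
            continue
        true_vars.add(head)
        for j in watchers.get(head, []):
            remaining[j] -= 1
            if remaining[j] == 0:
                queue.append(j)
    return true_vars

facts_and_rules = [([], "a"), (["a"], "b"), (["a", "b"], "c")]
print(horn_sat(facts_and_rules))                    # variables a, b and c are forced true
print(horn_sat(facts_and_rules + [(["c"], None)]))  # None (unsatisfiable)
```

Each derived variable can only be produced after the variables it depends on, so long implication chains force many sequential rounds. Those chains are the hard-to-parallelize instances the abstract refers to.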
Smarter Atomic Smart Pointers: Safe and Efficient Concurrent Memory Management (Abstract)
Proceedings of the 2023 ACM Workshop on Highlights of Parallel Computing Pub Date: 2023-07-18 DOI: 10.1145/3597635.3598027
Daniel Anderson, G. Blelloch, Yuanhao Wei
Abstract: We present a technique for concurrent memory management that combines the ease of use of automatic memory reclamation with the efficiency of state-of-the-art deferred reclamation algorithms. First, we combine ideas from reference counting and hazard pointers in a novel way to implement automatic concurrent reference counting with wait-free, constant-time overhead. Second, we generalize our previous algorithm to obtain a method for converting any standard manual safe memory reclamation (SMR) technique into an automatic reference counting technique with a similar performance profile. We have implemented the approach as a C++ library and compared it experimentally to existing atomic reference-counting libraries and state-of-the-art manual techniques. Our results indicate that our technique is faster than existing reference-counting implementations, and competitive with manual memory reclamation techniques. More importantly, it is significantly safer than manual techniques since objects are reclaimed automatically.
Citations: 0
Efficient Construction of Directed Hopsets and Parallel Single-source Shortest Paths (Abstract)
Proceedings of the 2023 ACM Workshop on Highlights of Parallel Computing Pub Date: 2023-07-18 DOI: 10.1145/3597635.3598019
Nairen Cao, Jeremy T. Fineman, Katina Russell
Abstract: The single-source shortest-path problem is as follows: given a graph with nonnegative edge weights and a designated source vertex s, return the distances from s to every other vertex. This paper presents a randomized parallel single-source shortest paths (SSSP) algorithm for directed graphs with non-negative integer edge weights that solves the exact SSSP problem in Õ(m) work and n^(1/2+o(1)) span, with high probability. All previous exact SSSP algorithms with nearly linear work have linear span, even for undirected unweighted graphs. To solve the exact SSSP problem, we first show a deterministic reduction from exact SSSP to directed hopsets using the iterative gradual rounding technique. A (β, ε)-hopset is a set of weighted edges, also known as shortcuts, that when added to the graph, admit β-hop paths with weights no more than (1 + ε) times the true shortest path distances. We show that (β, ε)-hopsets can be used to solve the exact SSSP problem in Õ(m) work and Õ(β) span. Furthermore, we present the first nearly linear-work algorithm for constructing hopsets on directed graphs. Our sequential algorithm runs in Õ(m) time and constructs a hopset with Õ(n) edges and β = n^(1/2+o(1)). We also provide a parallel version of the algorithm with Õ(m) work and n^(1/2+o(1)) span. The directed hopsets can be used to solve approximate SSSP problems efficiently, where the objective is to return estimates of the distances from the source vertex to every other vertex such that the estimate falls between the true distance and (1 + ε) times the distance. Specifically, for constant ε and graphs with polynomially-bounded real edge weights, there is an algorithm solving the approximate SSSP problem with Õ(m) work and n^(1/2+o(1)) span.
Citations: 0
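The hopset definition in the abstract above can be illustrated with a small sketch. It computes β-hop-limited distances using β rounds of Bellman-Ford relaxation and shows how one shortcut edge lets short-hop paths reach the true distances. It does not construct a hopset and is not the paper's algorithm. The function name `hop_limited_distances` and the example graph are assumptions made for illustration.

```python
import math

def hop_limited_distances(n, edges, source, beta):
    """Shortest distances from source using paths of at most beta edges.

    edges is a list of directed (u, v, weight) triples over vertices 0..n-1.
    Each round relaxes every edge against the previous round's distances,
    so after k rounds dist[v] is the best weight over paths with <= k hops.
    """
    dist = [math.inf] * n
    dist[source] = 0
    for _ in range(beta):
        new_dist = dist[:]
        for u, v, w in edges:
            if dist[u] + w < new_dist[v]:
                new_dist[v] = dist[u] + w
        dist = new_dist
    return dist

# A directed path 0 -> 1 -> 2 -> 3 -> 4 with unit weights.
path = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 4, 1)]
print(hop_limited_distances(5, path, 0, beta=2))   # [0, 1, 2, inf, inf]

# One shortcut carrying the exact distance from 0 to 3.
shortcut = [(0, 3, 3)]
print(hop_limited_distances(5, path + shortcut, 0, beta=2))   # [0, 1, 2, 3, 4]
```

With the shortcut added, two hops suffice to recover every exact distance, so this single edge acts as a (2, 0)-hopset for the source 0 on this toy graph. Each relaxation round is independent across edges, which is why a small β translates into low span.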
Taming Misaligned Graph Traversals in Concurrent Graph Processing (Abstract)
Proceedings of the 2023 ACM Workshop on Highlights of Parallel Computing Pub Date: 2023-07-18 DOI: 10.1145/3597635.3598028
Xizhe Yin, Zhijia Zhao, Rajiv Gupta
Abstract: This work introduces Glign, a runtime system that automatically aligns the graph traversals for concurrent queries. Glign introduces three levels of graph traversal alignment for iterative evaluation of concurrent queries. First, it synchronizes the accesses of different queries to the active parts of the graph within each iteration of the evaluation (intra-iteration alignment). On top of that, Glign leverages a key insight regarding the "heavy iterations" in query evaluation to achieve inter-iteration alignment and alignment-aware batching. The former aligns the iterations of different queries to increase the graph access sharing, while the latter tries to group queries of better graph access sharing into the same evaluation batch. Together, these alignment techniques can substantially boost the data locality of concurrent query evaluation. Based on our experiments, Glign outperforms the state-of-the-art concurrent graph processing systems Krill and GraphM by 3.6× and 4.7× on average, respectively.
Citations: 0
Parallel Strong Connectivity Based on Faster Reachability (Abstract)
Proceedings of the 2023 ACM Workshop on Highlights of Parallel Computing Pub Date: 2023-07-18 DOI: 10.1145/3597635.3598017
Letong Wang, Xiaojun Dong, Yan Gu, Yihan Sun
Abstract: In this paper, we propose a parallel strongly connected components (SCC) implementation that is efficient on a wide range of graphs. Our speedup comes from two novel techniques: vertical granularity control (VGC) and the parallel hash bag.
Citations: 1
Fast Parallel Algorithms for Euclidean Minimum Spanning Tree and Hierarchical Spatial Clustering (Abstract)
Proceedings of the 2023 ACM Workshop on Highlights of Parallel Computing Pub Date: 2023-07-18 DOI: 10.1145/3597635.3598025
Yiqiu Wang, Shangdi Yu, Yan Gu, Julian Shun
Abstract: This paper presents new parallel algorithms for generating Euclidean minimum spanning trees (EMST) and spatial clustering hierarchies (known as HDBSCAN*). Our approach is based on generating a well-separated pair decomposition followed by using Kruskal's minimum spanning tree algorithm and bichromatic closest pair computations. We introduce a new notion of well-separation to reduce the work and space of our algorithm for HDBSCAN*. We also give a new parallel divide-and-conquer algorithm for computing the dendrogram and reachability plots, which are used in visualizing clusters of different scales that arise for both EMST and HDBSCAN*. We show that our algorithms are theoretically efficient: they have work (number of operations) matching their sequential counterparts, and polylogarithmic depth (parallel time). We implement our algorithms and propose a memory optimization that requires only a subset of well-separated pairs to be computed and materialized, leading to savings in both space (up to 10x) and time (up to 8x). Our experiments on large real-world and synthetic data sets using a 48-core machine show that our fastest algorithms outperform the best serial algorithms for the problems by 11.13x to 55.89x, and existing parallel algorithms by at least an order of magnitude.
Citations: 0
Static Prediction of Parallel Computation Graphs (Abstract)
Proceedings of the 2023 ACM Workshop on Highlights of Parallel Computing Pub Date: 2023-07-18 DOI: 10.1145/3597635.3598026
Stefan K. Muller
Abstract: Many results in the theory of parallel scheduling, dating back to Brent's Theorem, are expressed in terms of the parallel dependency structure of a program as represented by a Directed Acyclic Graph (DAG). In the world of parallel and concurrent program analysis, such DAG models are also used to study deadlock, data races, and priority inversions, to name just a few examples. In all of these cases, it tends to be convenient to think of the DAG as a model of the program itself; we might say, for example, that the time to run a parallel program on P processors depends on the work and span of the program's DAG. This assumes that the DAG is a static, predictable property of the program. In reality, however, a DAG typically models the runtime relationships between threads during a particular execution of a program. To obtain the DAG, one might simulate an execution (or all possible executions) using some form of cost semantics, a dynamic semantics that produces the DAG as it executes the program. In fine-grained parallel programs, such as those built from constructs like fork/join, spawn/sync, async/finish, and futures, these DAGs tend to be especially dynamic and dependent on the features of a particular execution. For example, a divide-and-conquer algorithm implemented using fork/join parallelism may divide a certain number of times depending on the input size, and a program written with futures can choose to wait on threads or not wait on threads depending on conditions available only at runtime. Such programs are best represented by a (possibly infinite) family of DAGs, representing all possible executions of the program.
Citations: 0
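The abstract above mentions that running time on P processors depends on the work and span of a program's DAG. As a rough sketch of that relationship, the code below computes work and span for a small unit-cost DAG and evaluates the commonly stated Brent-style upper bound T_P ≤ W/P + S. The adjacency-list encoding and the names `work_and_span` and `brent_bound` are assumptions for illustration. The paper's own static analysis is not shown here.

```python
from functools import lru_cache

def work_and_span(dag):
    """Return (work, span) of a DAG whose nodes each cost one time unit.

    dag maps a node to the list of nodes that depend on it (its successors).
    Work is the number of nodes; span is the node count of the longest path.
    """
    nodes = set(dag) | {v for succs in dag.values() for v in succs}

    @lru_cache(maxsize=None)
    def longest_from(u):
        # Longest chain starting at u, counting u itself.
        return 1 + max((longest_from(v) for v in dag.get(u, ())), default=0)

    return len(nodes), max(longest_from(u) for u in nodes)

def brent_bound(work, span, p):
    """Upper bound on greedy-schedule time with p processors: W/p + S."""
    return work / p + span

# One fork/join: fork -> {left, right} -> join.
fork_join = {"fork": ["left", "right"], "left": ["join"], "right": ["join"]}
work, span = work_and_span(fork_join)
print(work, span)                  # 4 3
print(brent_bound(work, span, 2))  # 5.0
```

The example hard-codes a single DAG, which is exactly the simplification the abstract questions: a fork/join program that recurses a data-dependent number of times yields a different DAG, and so a different work and span, for each input.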
CommonGraph: Graph Analytics on Evolving Data (Abstract)
Proceedings of the 2023 ACM Workshop on Highlights of Parallel Computing Pub Date: 2023-07-18 DOI: 10.1145/3597635.3598022
Mahbod Afarin, Chao Gao, Shafiur Rahman, Nael B. Abu-Ghazaleh, Rajiv Gupta
Abstract: We consider the problem of graph analytics on evolving graphs. In this scenario, a query typically needs to be applied to different snapshots of the graph over an extended time window. We propose CommonGraph, an approach for efficient processing of queries on evolving graphs. We first observe that edge deletions are significantly more expensive than addition operations. CommonGraph converts all deletions to additions by finding a common graph that exists across all snapshots. After computing the query on this graph, to reach any snapshot, we simply need to add the missing edges and incrementally update the query results. CommonGraph also allows sharing of common additions among snapshots that require them, and breaks the sequential dependency inherent in the traditional streaming approach where snapshots are processed in sequence, enabling additional opportunities for parallelism. We incorporate the CommonGraph approach by extending the KickStarter streaming framework. CommonGraph achieves 1.38x-8.17x improvement in performance over KickStarter across multiple benchmarks.
Citations: 1
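The core idea in the abstract above, replacing deletions with additions relative to a graph shared by all snapshots, can be sketched with plain set operations. This is an illustrative simplification: it assumes each snapshot is given as a set of edges, the name `common_graph` is hypothetical, and the incremental query update and the KickStarter integration described in the abstract are not modeled.

```python
def common_graph(snapshots):
    """Split edge-set snapshots into a shared core plus per-snapshot additions.

    snapshots is a non-empty list of sets of (u, v) edges. The common graph
    is the intersection of all snapshots, and every snapshot is recovered as
    common | additions[i], so no edge deletions are ever needed.
    """
    common = set.intersection(*snapshots)
    additions = [snap - common for snap in snapshots]
    return common, additions

# Snapshot 1 is snapshot 0 with edge (1, 2) deleted and edge (3, 4) added.
snap0 = {(0, 1), (1, 2), (2, 3)}
snap1 = {(0, 1), (2, 3), (3, 4)}

common, additions = common_graph([snap0, snap1])
print(sorted(common))      # [(0, 1), (2, 3)]
print(additions)           # [{(1, 2)}, {(3, 4)}]
```

Because each snapshot depends only on the common graph and its own additions, the snapshots can be processed independently rather than in sequence, which is the source of the extra parallelism the abstract mentions.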
Accelerating Sparse Data Orchestration via Dynamic Reflexive Tiling (Extended Abstract)
Proceedings of the 2023 ACM Workshop on Highlights of Parallel Computing Pub Date: 2023-07-18 DOI: 10.1145/3597635.3598031
Toluwanimi O. Odemuyiwa, Hadi Asghari-Moghaddam, Michael Pellauer, Kartik Hegde, Po-An Tsai, N. Crago, A. Jaleel, J. Owens, Edgar Solomonik, J. Emer, Christopher W. Fletcher
Abstract: Tensor algebra involving multiple sparse operands is severely memory bound, making it a challenging target for acceleration. Furthermore, irregular sparsity complicates traditional techniques, such as tiling, for ameliorating memory bottlenecks. Prior sparse tiling schemes are sparsity unaware: they carve tensors into uniform coordinate-space shapes, which leads to low-occupancy tiles and thus lower exploitable reuse. To address these challenges, this paper proposes dynamic reflexive tiling (DRT), a novel tiling method that improves data reuse over prior art for sparse tensor kernels, unlocking significant performance improvement opportunities. DRT's key idea is dynamic sparsity-aware tiling. DRT continuously re-tiles sparse tensors at runtime based on the current sparsity of the active regions of all input tensors, to maximize accelerator buffer utilization while retaining the ability to co-iterate through tiles of distinct tensors. Through an extensive evaluation over a set of SuiteSparse matrices, we show how DRT can be applied to multiple prior accelerators with different dataflows (ExTensor, OuterSPACE, MatRaptor), improving their performance (by 3.3x, 5.1x, and 1.6x, respectively) while adding negligible area overhead. We apply DRT to higher-order tensor kernels to reduce DRAM traffic by 3.9x and 16.9x over a CPU implementation and prior-art tiling scheme, respectively. Finally, we show that the technique is portable to software, with an improvement of 7.29x and 2.94x in memory overhead compared to untiled sparse-sparse matrix multiplication (SpMSpM).
Citations: 1