Proceedings of the 34th ACM Symposium on Parallelism in Algorithms and Architectures: Latest Articles (Page 2)

Achieving Sublinear Complexity under Constant T in T-interval Dynamic Networks

Proceedings of the 34th ACM Symposium on Parallelism in Algorithms and Architectures Pub Date : 2022-07-11 DOI: 10.1145/3490148.3538571

Ruomu Hou, Irvan Jahja, Yucheng Sun, Jiyan Wu, Haifeng Yu

Citations: 0

Contention Resolution for Coded Radio Networks

Proceedings of the 34th ACM Symposium on Parallelism in Algorithms and Architectures Pub Date : 2022-07-11 DOI: 10.1145/3490148.3538573

M. A. Bender, Seth Gilbert, F. Kuhn, John Kuszmaul, M. Médard

Abstract: Randomized backoff protocols, such as exponential backoff, are a powerful tool for managing access to a shared resource, often a wireless communication channel (e.g., [1]). For a wireless device to transmit successfully, it uses a backoff protocol to ensure exclusive access to the channel. Modern radios, however, do not need exclusive access to the channel to communicate; in particular, they can receive useful information even when more than one device transmits at the same time. These capabilities have been exploited for many years by systems that rely on interference cancellation, physical layer network coding, and analog network coding to improve efficiency. For example, Zigzag decoding [56] demonstrated how a base station can decode messages sent by multiple devices simultaneously. In this paper, we address the following question: can we design a backoff protocol that is better than exponential backoff when exclusive channel access is not required? We define the Coded Radio Network Model, which generalizes traditional radio network models (e.g., [30]). We then introduce the Decodable Backoff Algorithm, a randomized backoff protocol that achieves an optimal throughput of 1 - o(1). (Throughput 1 is optimal, as simultaneous reception does not increase the channel capacity.) The algorithm breaks the constant throughput lower bound for traditional radio networks [47-49], showing the power of these new hardware capabilities.
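For readers unfamiliar with the baseline that this abstract compares against, the following is a minimal sketch of windowed binary exponential backoff in the classical model, where a slot succeeds only if exactly one device transmits. It is not the paper's Decodable Backoff Algorithm; the function name, the window-doubling rule, and the slotted batch setting are assumptions chosen for illustration.

```python
import random


def simulate_binary_exponential_backoff(num_devices, seed=0, max_slots=1_000_000):
    """Simulate windowed binary exponential backoff on a slotted channel.

    Assumed classical model: a slot delivers a message only when exactly
    one device transmits; two or more simultaneous transmissions collide.
    All devices start with one message at slot 0 (a single batch).
    Returns (slots_used, throughput), where throughput = messages / slots.
    """
    rng = random.Random(seed)
    window = {d: 1 for d in range(num_devices)}      # current backoff window
    next_slot = {d: 0 for d in range(num_devices)}   # next transmission slot
    pending = set(range(num_devices))
    slot = 0
    while pending and slot < max_slots:
        senders = [d for d in pending if next_slot[d] == slot]
        if len(senders) == 1:
            # Exclusive access: the lone sender succeeds.
            pending.remove(senders[0])
        else:
            # Collision (or idle slot): every sender doubles its window
            # and picks a uniformly random slot inside the new window.
            for d in senders:
                window[d] *= 2
                next_slot[d] = slot + 1 + rng.randrange(window[d])
        slot += 1
    if pending:
        raise RuntimeError("simulation did not finish within max_slots")
    return slot, num_devices / slot


if __name__ == "__main__":
    slots, throughput = simulate_binary_exponential_backoff(50, seed=1)
    print(f"50 messages delivered in {slots} slots, throughput {throughput:.3f}")
```

Under this model every collision wastes a slot, which is the inefficiency the abstract's coded model is meant to avoid.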

Citations: 0

Parallel Shortest Paths with Negative Edge Weights

Proceedings of the 34th ACM Symposium on Parallelism in Algorithms and Architectures Pub Date : 2022-07-11 DOI: 10.1145/3490148.3538583

Nairen Cao, Jeremy T. Fineman, Katina Russell

Citations: 1

Average Awake Complexity of MIS and Matching

Proceedings of the 34th ACM Symposium on Parallelism in Algorithms and Architectures Pub Date : 2022-07-11 DOI: 10.1145/3490148.3538566

M. Ghaffari, Julian Portmann

Abstract: Chatterjee, Gmyr, and Pandurangan [PODC 2020] recently introduced the notion of awake complexity for distributed algorithms, which measures the number of rounds in which a node is awake. In the other rounds, the node is sleeping and performs no computation or communication. Measuring the number of awake rounds can be significant in many settings of distributed computing, e.g., in sensor networks where energy consumption is a concern. In that paper, Chatterjee et al. provide an elegant randomized algorithm for the Maximal Independent Set (MIS) problem that achieves an O(1) node-averaged awake complexity. That is, the average awake time among the nodes is O(1) rounds. However, to achieve that, the algorithm sacrifices the more standard round complexity measure, going from the well-known O(log n) bound for MIS, due to Luby [STOC'85], to O(log^3.41 n) rounds. Our first contribution is to present a simple randomized distributed MIS algorithm that, with high probability, has O(1) node-averaged awake complexity and O(log n) worst-case round complexity. Our second, and more technical, contribution is to show algorithms with the same O(1) node-averaged awake complexity and O(log n) worst-case round complexity for 1+ε approximation of maximum matching and 2+ε approximation of minimum vertex cover, where ε denotes an arbitrarily small positive constant.
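To make the node-averaged measure concrete, here is a small sequential sketch of a Luby-style random-priority MIS computation that also records, for each node, how many rounds pass before it decides. This is not the algorithm of the paper: it naively assumes a node stays awake in every round until it joins the MIS or is removed by a neighbor, and the function name and adjacency-dictionary input are hypothetical.

```python
import random


def luby_mis_with_awake_rounds(adjacency, seed=0):
    """Luby-style MIS on an undirected graph given as {node: set_of_neighbors}.

    Assumption of this sketch: a node is awake in every round until it
    decides, so its awake count equals the round in which it decides.
    Returns (mis, awake_rounds).
    """
    rng = random.Random(seed)
    undecided = set(adjacency)
    mis = set()
    awake_rounds = {v: 0 for v in adjacency}
    while undecided:
        # Each undecided node draws a fresh random priority this round.
        priority = {v: rng.random() for v in undecided}
        for v in undecided:
            awake_rounds[v] += 1
        # A node joins the MIS if it beats all of its undecided neighbors.
        winners = {
            v for v in undecided
            if all(priority[v] < priority[u]
                   for u in adjacency[v] if u in undecided)
        }
        mis |= winners
        # Winners and their undecided neighbors are now decided.
        decided = set(winners)
        for v in winners:
            decided.update(u for u in adjacency[v] if u in undecided)
        undecided -= decided
    return mis, awake_rounds


if __name__ == "__main__":
    cycle = {i: {(i - 1) % 8, (i + 1) % 8} for i in range(8)}
    mis, awake = luby_mis_with_awake_rounds(cycle, seed=3)
    print("MIS:", sorted(mis))
    print("node-averaged awake rounds:", sum(awake.values()) / len(awake))
```

The node-averaged awake complexity discussed in the abstract corresponds to the mean of `awake_rounds`, while the worst-case round complexity corresponds to its maximum under this naive always-awake assumption.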

Citations: 9

A NUMA-Aware Recoverable Mutex Lock

Proceedings of the 34th ACM Symposium on Parallelism in Algorithms and Architectures Pub Date : 2022-07-11 DOI: 10.1145/3490148.3538594

Ahmed I. Fahmy, W. Golab

Abstract: The mutual exclusion (ME) problem has been of interest to the scientific community since it was first defined by Dijkstra. Various algorithms have been developed to solve the problem, such as the MCS and CLH queue-based locks. The problem was generalized into the recoverable mutual exclusion (RME) problem by Golab and Ramaraju to accommodate the possibility of process crash failures. Since then, multiple RME algorithms that vary in design and performance have been presented in the literature. Furthermore, non-uniform memory access (NUMA) architecture has become mainstream in designing modern distributed systems, stimulating the development of NUMA-aware mutex locks. To the best of our knowledge, none of the existing NUMA-aware mutex locks is recoverable. In addition, none of the transformation techniques in the literature, such as flat-combining and cohort-locking, is a black-box transformation. Precisely, each of the existing transformation techniques requires specific characteristics of, and possible modifications to, the underlying NUMA-oblivious lock. In this work, we propose the Recoverable Filter (RF) lock, a black-box transformation approach that exploits memory locality to transform a NUMA-oblivious recoverable mutex lock into a NUMA-aware one. Practical experiments are conducted using two existing RME algorithms, Golab and Hendler's (GH) and Jayanti, Jayanti, and Joshi's (JJJ). The two RME locks are transformed into NUMA-aware locks using the proposed RF and the existing cohort algorithms. Results show that, in multi-socket configurations, our transformation boosts the performance of the NUMA-oblivious RME locks by up to 45%. The RME locks transformed using the proposed RF lock are slower than their non-recoverable cohort variants by up to 9%. Outcomes demonstrate that the overhead of our algorithm is minimal when using a single socket. Moreover, a deeper empirical assessment shows that the gap in performance between GH and JJJ is due to the entry section of JJJ, not its exit section.

Citations: 0

Parallel Cover Trees and their Applications

Proceedings of the 34th ACM Symposium on Parallelism in Algorithms and Architectures Pub Date : 2022-07-11 DOI: 10.1145/3490148.3538581

Yan Gu, Zachary Napier, Yihan Sun, Letong Wang

Abstract: The cover tree is the canonical data structure that efficiently maintains a dynamic set of points in a metric space and supports nearest and k-nearest neighbor searches. For most real-world datasets with reasonable distributions (mathematically, constant expansion rate and bounded aspect ratio), single-point insertion, single-point deletion, and nearest neighbor search (NNS) have cost only logarithmic in the size of the point set. Unfortunately, due to the complexity of the cover tree algorithms and their use of depth-first traversal order, we were unaware of any parallel approaches for these algorithms. This paper shows highly parallel and work-efficient cover tree algorithms that can handle batch insertions (and thus construction) and batch deletions. Assuming constant expansion rate and bounded aspect ratio, inserting m points into, or deleting m points from, a cover tree with n points takes O(m log n) expected work and polylogarithmic span with high probability. Our algorithms rely on some novel algorithmic insights. We model the insertion and deletion process as a graph and use a maximal independent set (MIS) to generate tree nodes without conflicts. We use three key ideas to guarantee work-efficiency: the prefix-doubling scheme, a careful design to limit the size of the graph on which we apply MIS, and a strategy to propagate information among different levels in the cover tree. We also use path-copying to make our parallel cover tree a persistent data structure, which is useful in several applications. Using our parallel cover trees, we show work-efficient (or near-work-efficient) and highly parallel solutions for a list of problems in computational geometry and machine learning, including Euclidean minimum spanning tree (EMST), single-linkage clustering, bichromatic closest pair (BCP), density-based clustering and its hierarchical version, and others. To the best of our knowledge, many of them are the first solutions to achieve work-efficiency and polylogarithmic span assuming constant expansion rate and bounded aspect ratio.

Citations: 6

Preparing for Disaster: Leveraging Precomputation to Efficiently Repair Graph Structures Upon Failures

Proceedings of the 34th ACM Symposium on Parallelism in Algorithms and Architectures Pub Date : 2022-07-11 DOI: 10.1145/3490148.3538564

Calvin C. Newport, N. Vaidya, A. Weaver

Citations: 0

Brief Announcement: Faster Stencil Computations using Gaussian Approximations

Proceedings of the 34th ACM Symposium on Parallelism in Algorithms and Architectures Pub Date : 2022-07-11 DOI: 10.1145/3490148.3538558

Zafar Ahmad, R. Chowdhury, Rathish Das, P. Ganapathi, Aaron Gregory, Yimin Zhu

Citations: 1

A Fully-Distributed Scalable Peer-to-Peer Protocol for Byzantine-Resilient Distributed Hash Tables

Proceedings of the 34th ACM Symposium on Parallelism in Algorithms and Architectures Pub Date : 2022-07-11 DOI: 10.1145/3490148.3538588

John E. Augustine, Soumyottam Chatterjee, Gopal Pandurangan

Citations: 4

Performance Analysis and Modelling of Concurrent Multi-access Data Structures

Proceedings of the 34th ACM Symposium on Parallelism in Algorithms and Architectures Pub Date : 2022-07-11 DOI: 10.1145/3490148.3538578

A. Rukundo, A. Atalar, P. Tsigas

Abstract: The major impediment to scaling concurrent data structures is memory contention when accessing shared data structure access-points, which leads to thread serialisation and hinders parallelism. Aiming to address this challenge, a significant amount of work in the literature has proposed multi-access techniques that improve concurrent data structure parallelism. However, there is little work on analysing and modelling the execution behaviour of concurrent multi-access data structures, especially in a shared memory setting. In this paper, we analyse and model the general execution behaviour of concurrent multi-access data structures in the shared memory setting. We study and analyse the behaviour of the two popular random access patterns, shared (Remote) and exclusive (Local) access, and the behaviour of the two most commonly used atomic primitives for designing lock-free data structures, Compare-and-Swap and Fetch-and-Add. We model the concurrent multi-accesses by splitting the thread execution procedure into five logical sessions: i) side-work, ii) access-point search, iii) access-point acquisition, iv) access-point data acquisition, and v) access-point data operation. We model the acquisition of an access-point as a system of closed queuing networks with parallel servers, and data acquisition in terms of where the data is located within the memory system. We evaluate our model on a set of concurrent data structure designs including a counter, a stack, and a FIFO queue. The evaluation is carried out on two state-of-the-art multi-core processors: Intel Xeon Phi CPU 7290 with 72 physical cores and Intel Xeon E5-2695 with 14 physical cores. Our model is able to predict the throughput performance of the given concurrent data structures with 80% to 100% accuracy on both architectures.
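The two atomic primitives named in this abstract differ in how they behave under contention, which a shared counter illustrates. The sketch below only emulates their semantics: Python exposes no hardware atomics, so the hypothetical `AtomicCell` class uses a `threading.Lock` to make each operation indivisible, and it says nothing about the paper's queuing model or measured performance.

```python
import threading


class AtomicCell:
    """Emulated atomic integer (assumption: a lock stands in for hardware atomicity)."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def load(self):
        with self._lock:
            return self._value

    def compare_and_swap(self, expected, new):
        """Store `new` only if the cell still holds `expected`; report success."""
        with self._lock:
            if self._value == expected:
                self._value = new
                return True
            return False

    def fetch_and_add(self, delta):
        """Add `delta` unconditionally and return the previous value."""
        with self._lock:
            old = self._value
            self._value = old + delta
            return old


def increment_with_cas(cell):
    """Increment via a CAS retry loop; returns how many attempts failed."""
    retries = 0
    while True:
        seen = cell.load()
        if cell.compare_and_swap(seen, seen + 1):
            return retries
        retries += 1  # another thread changed the cell first; try again


def run_counter(num_threads, increments_per_thread, use_cas):
    """Have several threads increment one shared counter; return its final value."""
    cell = AtomicCell()

    def worker():
        for _ in range(increments_per_thread):
            if use_cas:
                increment_with_cas(cell)
            else:
                cell.fetch_and_add(1)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return cell.load()


if __name__ == "__main__":
    print("CAS counter:", run_counter(4, 1000, use_cas=True))
    print("FAA counter:", run_counter(4, 1000, use_cas=False))
```

Both variants reach the same final count; the difference is that a Compare-and-Swap increment may fail and retry when threads contend, whereas a Fetch-and-Add increment always completes in one operation.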

Citations: 0