ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming最新文献_第5页

21st century computer architecture 21世纪计算机体系结构

ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming Pub Date : 2014-02-06 DOI: 10.1145/2692916.2558890

M. Hill

引用次数: 13

SCCMulti: an improved parallel strongly connected components algorithm SCCMulti:一种改进的并行强连接分量算法

ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming Pub Date : 2014-02-06 DOI: 10.1145/2555243.2555286

Daniel Tomkins, Timmie G. Smith, N. Amato, Lawrence Rauchwerger

{"title":"SCCMulti: an improved parallel strongly connected components algorithm","authors":"Daniel Tomkins, Timmie G. Smith, N. Amato, Lawrence Rauchwerger","doi":"10.1145/2555243.2555286","DOIUrl":"https://doi.org/10.1145/2555243.2555286","url":null,"abstract":"Tarjan's famous linear time, sequential algorithm for finding the strongly connected components (SCCs) of a graph relies on depth first search, which is inherently sequential. Deterministic parallel algorithms solve this problem in logarithmic time using matrix multiplication techniques, but matrix multiplication requires a large amount of total work. Randomized algorithms based on reachability -- the ability to get from one vertex to another along a directed path -- greatly improve the work bound in the average case. However, these algorithms do not always perform well; for instance, Divide-and-Conquer Strong Components (DCSC), a scalable, divide-and-conquer algorithm, has good expected theoretical limits, but can perform very poorly on graphs for which the maximum reachability of any vertex is small. A related algorithm, MultiPivot, gives very high probability guarantees on the total amount of work for all graphs, but this improvement introduces an overhead that increases the average running time. This work introduces SCCMulti, a multi-pivot improvement of DCSC that offers the same consistency as MultiPivot without the time overhead. We provide experimental results demonstrating SCCMulti's scalability; these results also show that SCCMulti is more consistent than DCSC and is always faster than MultiPivot.","PeriodicalId":286119,"journal":{"name":"ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115787117","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 4

Initial study of multi-endpoint runtime for MPI+OpenMP hybrid programming model on multi-core systems MPI+OpenMP混合编程模型在多核系统上的多端点运行初探

ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming Pub Date : 2014-02-06 DOI: 10.1145/2555243.2555287

Miao Luo, Xiaoyi Lu, Khaled Hamidouche, K. Kandalla, D. Panda

引用次数: 8

Parallelizing dynamic programming through rank convergence 基于秩收敛的并行动态规划

ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming Pub Date : 2014-02-06 DOI: 10.1145/2555243.2555264

Saeed Maleki, M. Musuvathi, Todd Mytkowicz

引用次数: 31

Eliminating global interpreter locks in ruby through hardware transactional memory 通过硬件事务性内存消除ruby中的全局解释器锁

ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming Pub Date : 2014-02-06 DOI: 10.1145/2555243.2555247

Rei Odaira, J. Castaños, Hisanobu Tomari

{"title":"Eliminating global interpreter locks in ruby through hardware transactional memory","authors":"Rei Odaira, J. Castaños, Hisanobu Tomari","doi":"10.1145/2555243.2555247","DOIUrl":"https://doi.org/10.1145/2555243.2555247","url":null,"abstract":"Many scripting languages use a Global Interpreter Lock (GIL) to simplify the internal designs of their interpreters, but this kind of lock severely lowers the multi-thread per-formance on multi-core machines. This paper presents our first results eliminating the GIL in Ruby using Hardware Transactional Memory (HTM) in the IBM zEnterprise EC12 and Intel 4th Generation Core processors. Though prior prototypes replaced a GIL with HTM, we tested real-istic programs, the Ruby NAS Parallel Benchmarks (NPB), the WEBrick HTTP server, and Ruby on Rails. We devised a new technique to dynamically adjust the transaction lengths on a per-bytecode basis, so that we can optimize the likelihood of transaction aborts against the relative overhead of the instructions to begin and end the transactions. Our results show that HTM achieved 1.9- to 4.4-fold speedups in the NPB programs over the GIL with 12 threads, and 1.6- and 1.2-fold speedups in WEBrick and Ruby on Rails, respectively. The dynamic transaction-length adjustment chose the best transaction lengths for any number of threads and applications with sufficiently long running times.","PeriodicalId":286119,"journal":{"name":"ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming","volume":"103 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124616807","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 27

Infrastructure-free logging and replay of concurrent execution on multiple cores 无基础设施的日志记录和多核并发执行的重播

ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming Pub Date : 2014-02-06 DOI: 10.1145/2555243.2555274

K. H. Lee, Dohyeong Kim, X. Zhang

引用次数: 4

Detecting silent data corruption through data dynamic monitoring for scientific applications 通过科学应用的数据动态监测来检测静默数据损坏

ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming Pub Date : 2014-02-06 DOI: 10.1145/2555243.2555279

L. Bautista-Gomez, F. Cappello

引用次数: 46

Practical concurrent binary search trees via logical ordering 实用的并行二叉搜索树通过逻辑排序

ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming Pub Date : 2014-02-06 DOI: 10.1145/2555243.2555269

Dana Drachsler, Martin T. Vechev, Eran Yahav

{"title":"Practical concurrent binary search trees via logical ordering","authors":"Dana Drachsler, Martin T. Vechev, Eran Yahav","doi":"10.1145/2555243.2555269","DOIUrl":"https://doi.org/10.1145/2555243.2555269","url":null,"abstract":"We present practical, concurrent binary search tree (BST) algorithms that explicitly maintain logical ordering information in the data structure, permitting clean separation from its physical tree layout. We capture logical ordering using intervals, with the property that an item belongs to the tree if and only if the item is an endpoint of some interval. We are thus able to construct efficient, synchronization-free and intuitive lookup operations. We present (i) a concurrent non-balanced BST with a lock-free lookup, and (ii) a concurrent AVL tree with a lock-free lookup that requires no synchronization with any mutating operations, including balancing operations. Our algorithms apply on-time deletion; that is, every request for removal of a node, results in its immediate removal from the tree. This new feature did not exist in previous concurrent internal tree algorithms.\u0000 We implemented our concurrent BST algorithms and evaluated them against several state-of-the-art concurrent tree algorithms. Our experimental results show that our algorithms with lock-free contains and on-time deletion are practical and often comparable to the state-of-the-art.","PeriodicalId":286119,"journal":{"name":"ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121032545","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 93

Designing and auto-tuning parallel 3-D FFT for computation-communication overlap 计算-通信重叠的并行三维FFT设计与自整定

ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming Pub Date : 2014-02-06 DOI: 10.1145/2555243.2555249

Sukhyun Song, J. Hollingsworth

引用次数: 19

Concurrency testing using schedule bounding: an empirical study 使用进度边界的并发测试:一个实证研究

ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming Pub Date : 2014-02-06 DOI: 10.1145/2555243.2555260

Paul Thomson, A. Donaldson, A. Betts

{"title":"Concurrency testing using schedule bounding: an empirical study","authors":"Paul Thomson, A. Donaldson, A. Betts","doi":"10.1145/2555243.2555260","DOIUrl":"https://doi.org/10.1145/2555243.2555260","url":null,"abstract":"We present the first independent empirical study on schedule bounding techniques for systematic concurrency testing (SCT). We have gathered 52 buggy concurrent software benchmarks, drawn from public code bases, which we call SCTBench. We applied a modified version of an existing concurrency testing tool to SCTBench to attempt to answer several research questions, including: How effective are the two main schedule bounding techniques, preemption bounding and delay bounding, at bug finding? What challenges are associated with applying SCT to existing code? How effective is schedule bounding compared to a naive random scheduler at finding bugs? Our findings confirm that delay bounding is superior to preemption bounding and that schedule bounding is more effective at finding bugs than unbounded depth-first search. The majority of bugs in SCTBench can be exposed using a small bound (1-3), supporting previous claims, but there is at least one benchmark that requires 5 preemptions. Surprisingly, we found that a naive random scheduler is at least as effective as schedule bounding for finding bugs. We have made SCTBench and our tools publicly available for reproducibility and use in future work.","PeriodicalId":286119,"journal":{"name":"ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132711589","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 65