Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation: Latest Publications

Dynamic trace-based analysis of vectorization potential of applications
Justin Holewinski, R. Ramamurthi, Mahesh Ravishankar, Naznin Fauzia, L. Pouchet, A. Rountev, P. Sadayappan
DOI: 10.1145/2254064.2254108 (https://doi.org/10.1145/2254064.2254108)
Published: 2012-06-11
Abstract: Recent hardware trends with GPUs and the increasing vector lengths of SSE-like ISA extensions for multicore CPUs imply that effective exploitation of SIMD parallelism is critical for achieving high performance on emerging and future architectures. A vast majority of existing applications were developed without any attention by their developers towards effective vectorizability of the codes. While developers of production compilers such as GNU gcc, Intel icc, PGI pgcc, and IBM xlc have invested considerable effort and made significant advances in enhancing automatic vectorization capabilities, these compilers still cannot effectively vectorize many existing scientific and engineering codes. It is therefore of considerable interest to analyze existing applications to assess the inherent latent potential for SIMD parallelism, exploitable through further compiler advances and/or via manual code changes. In this paper we develop an approach to infer a program's SIMD parallelization potential by analyzing the dynamic data-dependence graph derived from a sequential execution trace. By considering only the observed run-time data dependences for the trace, and by relaxing the execution order of operations to allow any dependence-preserving reordering, we can detect potential SIMD parallelism that may otherwise be missed by more conservative compile-time analyses. We show that for several benchmarks our tool discovers regions of code within computationally-intensive loops that exhibit high potential for SIMD parallelism but are not vectorized by state-of-the-art compilers. We present several case studies of the use of the tool, both in identifying opportunities to enhance the transformation capabilities of vectorizing compilers, as well as in pointing to code regions to manually modify in order to enable auto-vectorization and performance improvement by existing compilers.
Citations: 62
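
The core relaxation step can be pictured with a small sketch (an illustration of the idea, not the paper's tool): build the dynamic data-dependence graph from a recorded trace and place each operation at the earliest level consistent with its dependences; operations that share a level have no dependence path between them and are candidates for one SIMD step. The `Op` record and the trace format below are hypothetical.

```python
from collections import defaultdict

class Op:
    """One dynamic instance of an operation from a sequential execution trace."""
    def __init__(self, op_id, reads, writes):
        self.op_id = op_id      # position in the trace
        self.reads = reads      # set of memory locations read
        self.writes = writes    # set of memory locations written

def simd_levels(trace):
    """Relax the sequential order: keep only the observed data dependences
    (flow, anti, output) and place each op at the earliest level at which all
    of its predecessors have completed (ASAP scheduling).  Ops that share a
    level are candidates for one SIMD step."""
    level = {}
    last_writer = {}                            # addr -> op_id of the most recent write
    readers_since_write = defaultdict(list)     # addr -> ops that read it since that write

    for op in trace:
        preds = set()
        for a in op.reads:                      # flow dependence: read after write
            if a in last_writer:
                preds.add(last_writer[a])
        for a in op.writes:
            if a in last_writer:                # output dependence: write after write
                preds.add(last_writer[a])
            preds.update(readers_since_write[a])  # anti dependence: write after read
        level[op.op_id] = 1 + max((level[p] for p in preds), default=-1)
        for a in op.writes:
            last_writer[a] = op.op_id
            readers_since_write[a] = []
        for a in op.reads:
            readers_since_write[a].append(op.op_id)

    groups = defaultdict(list)
    for op_id, lvl in level.items():
        groups[lvl].append(op_id)
    return dict(groups)                         # level -> ops that could run in lockstep

# Toy trace for a[i] = b[i] + c[i], i = 0..3: the four adds land on level 0.
trace = [Op(i, reads={("b", i), ("c", i)}, writes={("a", i)}) for i in range(4)]
print(simd_levels(trace))   # {0: [0, 1, 2, 3]}
```
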
A compiler framework for extracting superword level parallelism
Jun Liu, Yuanrui Zhang, Ohyoung Jang, W. Ding, M. Kandemir
DOI: 10.1145/2254064.2254106 (https://doi.org/10.1145/2254064.2254106)
Published: 2012-06-11
Abstract: SIMD (single-instruction multiple-data) instruction set extensions are quite common today in both high performance and embedded microprocessors, and enable the exploitation of a specific type of data parallelism called SLP (Superword Level Parallelism). While prior research shows that significant performance savings are possible when SLP is exploited, placing SIMD instructions in an application code manually can be very difficult and error-prone. In this paper, we propose a novel automated compiler framework for improving superword level parallelism exploitation. The key part of our framework consists of two stages: superword statement generation and data layout optimization. The first stage is our main contribution and has two phases, statement grouping and statement scheduling, of which the primary goals are to increase SIMD parallelism and, more importantly, capture more superword reuses among the superword statements through global data access and reuse pattern analysis. Further, as a complementary optimization, our data layout optimization organizes data in memory space such that the price of memory operations for SLP is minimized. The results from our compiler implementation and tests on two systems indicate performance improvements as high as 15.2% over a state-of-the-art SLP optimization algorithm.
Citations: 48
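
A minimal sketch of the statement-grouping phase, assuming statements of the simple form dst[i] = lhs[i] op rhs[i] (the names `Stmt` and `group_superwords` and the fixed vector width are illustrative, not the paper's framework): isomorphic statements with adjacent indices are packed into one superword statement; scheduling, reuse analysis, and the data layout optimization are omitted.

```python
from dataclasses import dataclass

VEC_WIDTH = 4   # lanes per superword, e.g. 4 x 32-bit values in a 128-bit register

@dataclass
class Stmt:
    """A scalar statement of the form dst[idx] = lhs[idx] OP rhs[idx]."""
    op: str
    dst: str
    lhs: str
    rhs: str
    idx: int

def group_superwords(stmts):
    """Statement grouping: collect isomorphic statements (same operator and
    arrays) whose indices are contiguous and emit one superword statement per
    full pack."""
    packs = []
    by_shape = {}
    for s in sorted(stmts, key=lambda s: (s.op, s.dst, s.lhs, s.rhs, s.idx)):
        pack = by_shape.setdefault((s.op, s.dst, s.lhs, s.rhs), [])
        if pack and s.idx != pack[-1].idx + 1:
            pack.clear()                      # not contiguous: start a new pack
        pack.append(s)
        if len(pack) == VEC_WIDTH:
            base = pack[0].idx
            packs.append(f"{s.dst}[{base}:{base+VEC_WIDTH}] = "
                         f"{s.lhs}[{base}:{base+VEC_WIDTH}] {s.op} "
                         f"{s.rhs}[{base}:{base+VEC_WIDTH}]")
            pack.clear()
    return packs

stmts = [Stmt("+", "a", "b", "c", i) for i in range(8)]
for superword in group_superwords(stmts):
    print(superword)
# a[0:4] = b[0:4] + c[0:4]
# a[4:8] = b[4:8] + c[4:8]
```
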
Dynamic synthesis for relaxed memory models
Feng Liu, Nayden Nedev, Nedyalko Prisadnikov, Martin T. Vechev, Eran Yahav
DOI: 10.1145/2254064.2254115 (https://doi.org/10.1145/2254064.2254115)
Published: 2012-06-11
Abstract: Modern architectures implement relaxed memory models which may reorder memory operations or execute them non-atomically. Special instructions called memory fences are provided, allowing control of this behavior. To implement a concurrent algorithm for a modern architecture, the programmer is forced to manually reason about subtle relaxed behaviors and figure out ways to control these behaviors by adding fences to the program. Not only is this process time consuming and error-prone, but it has to be repeated every time the implementation is ported to a different architecture. In this paper, we present the first scalable framework for handling real-world concurrent algorithms running on relaxed architectures. Given a concurrent C program, a safety specification, and a description of the memory model, our framework tests the program on the memory model to expose violations of the specification, and synthesizes a set of necessary ordering constraints that prevent these violations. The ordering constraints are then realized as additional fences in the program. We implemented our approach in a tool called DFence based on LLVM and used it to infer fences in a number of concurrent algorithms. Using DFence, we perform the first in-depth study of the interaction between fences in real-world concurrent C programs, correctness criteria such as sequential consistency and linearizability, and memory models such as TSO and PSO, yielding many interesting observations. We believe that this is the first tool that can handle programs at the scale and complexity of a lock-free memory allocator.
Citations: 78
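
A toy rendition of the synthesis step only, not DFence itself: assume testing under the memory model has already produced a set of violating reorderings, each annotated with the program points at which a fence would prevent it, and pick a small set of fence locations that covers all of them. The greedy covering below is a stand-in for the paper's constraint synthesis, and the violation format is made up.

```python
def synthesize_fences(violations):
    """Each violation is the set of program points (instruction indices) at
    which inserting a fence would prevent that violating reordering.  Greedily
    pick points that cover the most remaining violations until none are left;
    the result is a (not necessarily minimal) fence set that forbids them all."""
    remaining = [set(v) for v in violations]
    fences = set()
    while remaining:
        candidates = set().union(*remaining)
        best = max(candidates, key=lambda p: sum(p in v for v in remaining))
        fences.add(best)
        remaining = [v for v in remaining if best not in v]
    return fences

# Three bad reorderings observed while testing under a TSO-like model; the
# first can be killed by a fence after instruction 2 or after 5, and so on.
violations = [{2, 5}, {2, 7}, {7}]
print(sorted(synthesize_fences(violations)))   # e.g. [2, 7]
```
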
JANUS: exploiting parallelism via hindsight
Omer Tripp, R. Manevich, J. Field, Shmuel Sagiv
DOI: 10.1145/2254064.2254083 (https://doi.org/10.1145/2254064.2254083)
Published: 2012-06-11
Abstract: This paper addresses the problem of reducing unnecessary conflicts in optimistic synchronization. Optimistic synchronization must ensure that any two concurrently executing transactions that commit are properly synchronized. Conflict detection is an approximate check for this condition. For efficiency, the traditional approach to conflict detection conservatively checks that the memory locations mutually accessed by two concurrent transactions are accessed only for reading. We present JANUS, a parallelization system that performs conflict detection by considering sequences of operations and their composite effect on the system's state. This is done efficiently, such that the runtime overhead due to conflict detection is on a par with that of write-conflict-based detection. In certain common scenarios, this mode of refinement dramatically improves the precision of conflict detection, thereby reducing the number of false conflicts. Our empirical evaluation of JANUS shows that this precision gain reduces the abort rate by an order of magnitude (22x on average), and achieves a speedup of up to 2.5x, on a suite of real-world benchmarks where no parallelism is exploited by the standard approach.
Citations: 17
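
The precision gain can be illustrated with a hand-rolled commutativity check, which is not JANUS's actual refinement but shows the same intuition: rather than flagging any overlapping writes, compare the composite effect of the two transactions' operation sequences applied in either order and report a conflict only if the final states differ.

```python
import copy

def apply_ops(state, ops):
    """Apply a sequence of operations (functions from state to state) in order."""
    for op in ops:
        state = op(state)
    return state

def conflict(state, ops_a, ops_b):
    """Write-conflict detection would always flag two transactions that touch the
    same location.  Sequence-based detection instead asks whether their composite
    effects commute on the current state."""
    ab = apply_ops(apply_ops(copy.deepcopy(state), ops_a), ops_b)
    ba = apply_ops(apply_ops(copy.deepcopy(state), ops_b), ops_a)
    return ab != ba

# Two transactions that both increment a shared counter: a classic false
# conflict for write-based detection, but their composite effects commute.
inc = lambda st: {**st, "count": st["count"] + 1}
print(conflict({"count": 0}, [inc], [inc]))   # False -> no abort needed

# A transaction that doubles and one that increments do not commute.
dbl = lambda st: {**st, "count": st["count"] * 2}
print(conflict({"count": 1}, [inc], [dbl]))   # True  -> genuine conflict
```
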
RockSalt: better, faster, stronger SFI for the x86
Greg Morrisett, Gang Tan, Joseph Tassarotti, Jean-Baptiste Tristan, Edward Gan
DOI: 10.1145/2254064.2254111 (https://doi.org/10.1145/2254064.2254111)
Published: 2012-06-11
Abstract: Software-based fault isolation (SFI), as used in Google's Native Client (NaCl), relies upon a conceptually simple machine-code analysis to enforce a security policy. But for complicated architectures such as the x86, it is all too easy to get the details of the analysis wrong. We have built a new checker that is smaller, faster, and has a much reduced trusted computing base when compared to Google's original analysis. The key to our approach is automatically generating the bulk of the analysis from a declarative description which we relate to a formal model of a subset of the x86 instruction set architecture. The x86 model, developed in Coq, is of independent interest and should be usable for a wide range of machine-level verification tasks.
Citations: 141
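
A drastically simplified sketch of the kind of policy an SFI checker enforces, over a made-up toy instruction stream rather than decoded x86: every computed jump must be immediately preceded by a masking instruction that confines the target to the sandbox. RockSalt generates its real checker from a declarative description tied to an x86 model formalized in Coq; the sketch below only mirrors the shape of the check, and the mask constant is an assumption.

```python
SANDBOX_MASK = 0x0FFF_FFE0   # toy policy: targets inside a 256 MB region, 32-byte aligned

def sfi_check(instrs):
    """instrs is a list of (mnemonic, operand) pairs in a toy ISA.  Accept the
    code only if every computed jump is immediately preceded by an AND of the
    same register with SANDBOX_MASK, so the jump cannot escape the sandbox."""
    for i, (mnem, operand) in enumerate(instrs):
        if mnem == "jmp_reg":
            guarded = (i > 0
                       and instrs[i - 1][0] == "and_imm"
                       and instrs[i - 1][1] == (operand, SANDBOX_MASK))
            if not guarded:
                return False, f"unguarded indirect jump at {i}"
    return True, "ok"

good = [("and_imm", ("eax", SANDBOX_MASK)), ("jmp_reg", "eax")]
bad  = [("mov", ("eax", 0xDEADBEEF)), ("jmp_reg", "eax")]
print(sfi_check(good))   # (True, 'ok')
print(sfi_check(bad))    # (False, 'unguarded indirect jump at 1')
```
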
Automated error diagnosis using abductive inference
Işıl Dillig, Thomas Dillig, A. Aiken
DOI: 10.1145/2254064.2254087 (https://doi.org/10.1145/2254064.2254087)
Published: 2012-06-11
Abstract: When program verification tools fail to verify a program, either the program is buggy or the report is a false alarm. In this situation, the burden is on the user to manually classify the report, but this task is time-consuming, error-prone, and does not utilize facts already proven by the analysis. We present a new technique for assisting users in classifying error reports. Our technique computes small, relevant queries presented to a user that capture exactly the information the analysis is missing to either discharge or validate the error. Our insight is that identifying these missing facts is an instance of the abductive inference problem in logic, and we present a new algorithm for computing the smallest and most general abductions in this setting. We perform the first user study to rigorously evaluate the accuracy and effort involved in manual classification of error reports. Our study demonstrates that our new technique is very useful for improving both the speed and accuracy of error report classification. Specifically, our approach improves classification accuracy from 33% to 90% and reduces the time programmers take to classify error reports from approximately 5 minutes to under 1 minute.
Citations: 100
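
The underlying abduction query can be shown in propositional miniature, with entirely invented predicates: find the smallest set of candidate facts chi such that facts ∧ chi is satisfiable and facts ∧ chi entails the goal the analysis could not prove. The paper works over richer logics and also maximizes generality; the brute-force version below only demonstrates the shape of the problem.

```python
from itertools import combinations, product

def models(var_names):
    """Enumerate all truth assignments over the given variables."""
    for values in product([False, True], repeat=len(var_names)):
        yield dict(zip(var_names, values))

def entails(premise, conclusion, var_names):
    return all(conclusion(m) for m in models(var_names) if premise(m))

def satisfiable(formula, var_names):
    return any(formula(m) for m in models(var_names))

def abduce(facts, goal, candidates, var_names):
    """Return the smallest set of candidate facts chi such that facts & chi is
    satisfiable and facts & chi entails the goal."""
    for size in range(len(candidates) + 1):
        for chi in combinations(candidates, size):
            combined = lambda m, chi=chi: facts(m) and all(f(m) for _, f in chi)
            if satisfiable(combined, var_names) and entails(combined, goal, var_names):
                return [name for name, _ in chi]
    return None

# The analysis has proved: p holds, and (q and r) implies safe.  It cannot
# prove "safe" on its own; abduction identifies exactly what to ask the user.
VARS = ["p", "q", "r", "safe"]
facts = lambda m: m["p"] and ((not (m["q"] and m["r"])) or m["safe"])
goal = lambda m: m["safe"]
candidates = [("q", lambda m: m["q"]), ("r", lambda m: m["r"])]
print(abduce(facts, goal, candidates, VARS))   # ['q', 'r']
```
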
Understanding and detecting real-world performance bugs
Guoliang Jin, Linhai Song, Xiaoming Shi, Joel Scherpelz, Shan Lu
DOI: 10.1145/2254064.2254075 (https://doi.org/10.1145/2254064.2254075)
Published: 2012-06-11
Abstract: Developers frequently use inefficient code sequences that could be fixed by simple patches. These inefficient code sequences can cause significant performance degradation and resource waste, referred to as performance bugs. Meager increases in single-threaded performance in the multi-core era and increasing emphasis on energy efficiency call for more effort in tackling performance bugs. This paper conducts a comprehensive study of 110 real-world performance bugs that are randomly sampled from five representative software suites (Apache, Chrome, GCC, Mozilla, and MySQL). The findings of this study provide guidance for future work to avoid, expose, detect, and fix performance bugs. Guided by our characteristics study, efficiency rules are extracted from 25 patches and are used to detect performance bugs. 332 previously unknown performance problems are found in the latest versions of MySQL, Apache, and Mozilla applications, including 219 performance problems found by applying rules across applications.
Citations: 346
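
One way to picture a patch-derived efficiency rule, using invented rule names and a deliberately shallow textual match (the paper's rules come from real patches and are checked with far more context): flag a call whose result is loop-invariant but is re-evaluated on every iteration, such as strlen() in a loop condition.

```python
import re

# Each rule is an (identifier, description, pattern) triple distilled from a
# patch that fixed a performance bug; the rules below are invented examples.
RULES = [
    ("loop-invariant-strlen",
     "strlen() of a loop-invariant string evaluated in the loop condition",
     re.compile(r"for\s*\([^;]*;\s*[^;]*\bstrlen\s*\(", re.S)),
    ("string-concat-in-loop",
     "repeated strcat() inside a loop body",
     re.compile(r"for\s*\([^)]*\)\s*\{[^}]*\bstrcat\s*\(", re.S)),
]

def check(source):
    """Apply every efficiency rule to the source text and report matches."""
    return [(rule_id, desc) for rule_id, desc, pattern in RULES
            if pattern.search(source)]

buggy = """
for (i = 0; i < strlen(s); i++) {
    total += s[i];
}
"""
for rule_id, desc in check(buggy):
    print(f"{rule_id}: {desc}")   # loop-invariant-strlen: strlen() of a ...
```
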
Synthesising graphics card programs from DSLs
Luke Cartey, Rune B. Lyngsø, O. Moor
DOI: 10.1145/2254064.2254080 (https://doi.org/10.1145/2254064.2254080)
Published: 2012-06-11
Abstract: Over the last five years, graphics cards have become a tempting target for scientific computing, thanks to unrivaled peak performance, often producing a runtime speed-up of x10 to x25 over comparable CPU solutions. However, this increase can be difficult to achieve, and doing so often requires a fundamental rethink. This is especially problematic in scientific computing, where experts do not want to learn yet another architecture. In this paper we develop a method for automatically parallelising recursive functions of the sort found in scientific papers. Using a static analysis of the function dependencies we identify sets - partitions - of independent elements, which we use to synthesise an efficient GPU implementation using polyhedral code generation techniques. We then augment our language with DSL extensions to support a wider variety of applications, and demonstrate the effectiveness of this with three case studies, showing significant performance improvement over equivalent CPU methods, and similar efficiency to hand-tuned GPU implementations.
Citations: 8
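
A hand-written illustration of the partitioning idea on the kind of recurrence such methods target (dynamic-programming tables as in sequence alignment), not the tool's polyhedral code generator: if cell (i, j) depends only on (i-1, j), (i, j-1), and (i-1, j-1), then all cells with equal i + j are mutually independent and form one partition that a GPU could compute in a single parallel step.

```python
from collections import defaultdict

def wavefront_partitions(n, m, deps=((-1, 0), (0, -1), (-1, -1))):
    """Partition the cells of an n x m dynamic-programming table into sets of
    mutually independent elements.  For the given dependence offsets every
    dependence strictly decreases i + j, so grouping by i + j is safe: no cell
    depends on another cell in its own partition."""
    assert all(di + dj < 0 for di, dj in deps), "offsets must reduce i + j"
    partitions = defaultdict(list)
    for i in range(n):
        for j in range(m):
            partitions[i + j].append((i, j))
    return [partitions[k] for k in sorted(partitions)]

# Each inner list could be dispatched as one parallel GPU step, with every
# cell in the list computed by its own thread.
for step, cells in enumerate(wavefront_partitions(3, 3)):
    print(step, cells)
# 0 [(0, 0)]
# 1 [(0, 1), (1, 0)]
# 2 [(0, 2), (1, 1), (2, 0)]
# 3 [(1, 2), (2, 1)]
# 4 [(2, 2)]
```
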
Session details: Performance analysis
Amer Diwan
DOI: 10.1145/3250580 (https://doi.org/10.1145/3250580)
Published: 2012-06-11
Citations: 0

Test-case reduction for C compiler bugs
J. Regehr, Yang Chen, Pascal Cuoq, E. Eide, Chucky Ellison, Xuejun Yang
DOI: 10.1145/2254064.2254104 (https://doi.org/10.1145/2254064.2254104)
Published: 2012-06-11
Abstract: To report a compiler bug, one must often find a small test case that triggers the bug. The existing approach to automated test-case reduction, delta debugging, works by removing substrings of the original input; the result is a concatenation of substrings that delta cannot remove. We have found this approach less than ideal for reducing C programs because it typically yields test cases that are too large or even invalid (relying on undefined behavior). To obtain small and valid test cases consistently, we designed and implemented three new, domain-specific test-case reducers. The best of these is based on a novel framework in which a generic fixpoint computation invokes modular transformations that perform reduction operations. This reducer produces outputs that are, on average, more than 25 times smaller than those produced by our other reducers or by the existing reducer that is most commonly used by compiler developers. We conclude that effective program reduction requires more than straightforward delta debugging.
Citations: 255
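
The fixpoint-of-modular-transformations structure can be mocked up in a few lines, with a placeholder token-dropping pass and an invented interestingness oracle standing in for "still triggers the compiler bug"; the paper's actual reducers apply source-aware C transformations rather than anything this crude.

```python
def reduce_test_case(test, passes, interesting):
    """Repeatedly run every transformation pass; keep a variant whenever it is
    smaller and still 'interesting' (still triggers the bug).  Stop when a full
    round of passes makes no progress, i.e. at a fixpoint."""
    changed = True
    while changed:
        changed = False
        for transform in passes:
            for variant in transform(test):
                if len(variant) < len(test) and interesting(variant):
                    test = variant
                    changed = True
    return test

# Placeholder pass over a token list: drop one token at a time.  Real reducer
# passes transform C source with syntactic and semantic awareness.
def drop_one_token(tokens):
    for i in range(len(tokens)):
        yield tokens[:i] + tokens[i + 1:]

passes = [drop_one_token]

# Invented oracle: the "bug" triggers whenever the marker token is present.
interesting = lambda toks: "BUGGY_EXPR" in toks

original = ["int", "x", ";", "int", "y", "=", "BUGGY_EXPR", ";",
            "void", "f", "(", ")", "{", "}"]
print(reduce_test_case(original, passes, interesting))   # ['BUGGY_EXPR']
```
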