Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation: latest articles, page 2

Optimizing off-chip accesses in multicores

Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation Pub Date : 2015-06-03 DOI: 10.1145/2737924.2737989

W. Ding, Xulong Tang, M. Kandemir, Yuanrui Zhang, Emre Kultursay

Abstract: In a network-on-chip (NoC) based manycore architecture, an off-chip data access (main memory access) needs to travel through the on-chip network, spending considerable amount of time within the chip (in addition to the memory access latency). In addition, it contends with on-chip (cache) accesses as both use the same NoC resources. In this paper, focusing on data-parallel, multithreaded applications, we propose a compiler-based off-chip data access localization strategy, which places data elements in the memory space such that an off-chip access traverses a minimum number of links (hops) to reach the memory controller that handles this access. This brings three main benefits. First, the network latency of off-chip accesses gets reduced; second, the network latency of on-chip accesses gets reduced; and finally, the memory latency of off-chip accesses improves, due to reduced queue latencies. We present an experimental evaluation of our optimization strategy using a set of 13 multithreaded application programs under both private and shared last-level caches. The results collected emphasize the importance of optimizing the off-chip data accesses.
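The hop-count argument in this abstract can be illustrated with a small model. The sketch below is not the paper's compiler pass; it assumes a hypothetical 4x4 mesh, four memory controllers at the corners, and dimension-ordered routing so that the hop count equals the Manhattan distance. It compares a baseline in which each core's data happens to live behind an arbitrary controller with a localized placement that uses the nearest controller.

```python
from itertools import product

MESH = 4                                         # assumed 4x4 mesh of cores
CONTROLLERS = [(0, 0), (0, 3), (3, 0), (3, 3)]   # assumed corner memory controllers


def hops(src, dst):
    """Links traversed between two mesh nodes (Manhattan distance)."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


def nearest_controller(core):
    """Localized placement: the core's data sits behind the closest controller."""
    return min(CONTROLLERS, key=lambda mc: hops(core, mc))


def interleaved(core):
    """Baseline: controller chosen from the core index, ignoring distance."""
    index = core[0] * MESH + core[1]
    return CONTROLLERS[index % len(CONTROLLERS)]


def average_hops(assign):
    """Mean hop count over all cores for a given core-to-controller mapping."""
    cores = list(product(range(MESH), repeat=2))
    return sum(hops(core, assign(core)) for core in cores) / len(cores)


if __name__ == "__main__":
    print("baseline :", average_hops(interleaved))
    print("localized:", average_hops(nearest_controller))
```

In this toy configuration the localized mapping lowers the average distance from 2.5 hops to 1.0 hop. Real gains depend on the topology, the address-to-controller mapping, and contention, none of which this model captures.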

Citations: 21

Tree dependence analysis

Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation Pub Date : 2015-06-03 DOI: 10.1145/2737924.2737972

Yusheng Weijiang, S. Balakrishna, Jianqiao Liu, Milind Kulkarni

Citations: 13

Peer-to-peer affine commitment using bitcoin

Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation Pub Date : 2015-06-03 DOI: 10.1145/2737924.2737997

Karl Crary, Michael J. Sullivan

Citations: 26

Type-and-example-directed program synthesis

Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation Pub Date : 2015-06-03 DOI: 10.1145/2737924.2738007

Peter-Michael Osera, S. Zdancewic

Citations: 224

Automatically improving accuracy for floating point expressions

Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation Pub Date : 2015-06-03 DOI: 10.1145/2737924.2737959

P. Panchekha, Alex Sanchez-Stern, James R. Wilcox, Zachary Tatlock

Abstract: Scientific and engineering applications depend on floating point arithmetic to approximate real arithmetic. This approximation introduces rounding error, which can accumulate to produce unacceptable results. While the numerical methods literature provides techniques to mitigate rounding error, applying these techniques requires manually rearranging expressions and understanding the finer details of floating point arithmetic. We introduce Herbie, a tool which automatically discovers the rewrites experts perform to improve accuracy. Herbie's heuristic search estimates and localizes rounding error using sampled points (rather than static error analysis), applies a database of rules to generate improvements, takes series expansions, and combines improvements for different input regions. We evaluated Herbie on examples from a classic numerical methods textbook, and found that Herbie was able to improve accuracy on each example, some by up to 60 bits, while imposing a median performance overhead of 40%. Colleagues in machine learning have used Herbie to significantly improve the results of a clustering algorithm, and a mathematical library has accepted two patches generated using Herbie.
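The kind of rewrite this abstract describes can be shown on a well-known cancellation-prone expression, sqrt(x + 1) - sqrt(x). The sketch below does not use Herbie and does not reproduce its search; it hand-applies one algebraically equivalent rewrite and estimates the error at a few sampled points against a high-precision reference computed with Python's `decimal` module. The sample points and the function names are assumptions chosen for illustration.

```python
import math
from decimal import Decimal, getcontext

getcontext().prec = 60  # working precision for the reference values


def naive(x):
    """Direct form: suffers catastrophic cancellation for large x."""
    return math.sqrt(x + 1) - math.sqrt(x)


def rewritten(x):
    """Equivalent form obtained by multiplying by the conjugate."""
    return 1 / (math.sqrt(x + 1) + math.sqrt(x))


def reference(x):
    """High-precision value of sqrt(x + 1) - sqrt(x)."""
    d = Decimal(x)
    return (d + 1).sqrt() - d.sqrt()


def relative_error(approx, x):
    """Relative error of a float result against the reference at x."""
    exact = reference(x)
    return float(abs(Decimal(approx) - exact) / exact)


if __name__ == "__main__":
    for x in [1.0, 1e4, 1e8, 1e12, 1e16]:   # assumed sample points
        print(x, relative_error(naive(x), x), relative_error(rewritten(x), x))
```

At x = 1e16 the direct form returns 0.0, because x + 1 rounds back to x in double precision, while the rewritten form stays close to the reference. Sampling against a higher-precision evaluation is only a rough stand-in for the error estimation the abstract mentions.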

Citations: 176

LaminarIR: compile-time queues for structured streams

Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation Pub Date : 2015-06-03 DOI: 10.1145/2737924.2737994

Yousun Ko, Bernd Burgstaller, Bernhard Scholz

Abstract: Stream programming languages employ FIFO (first-in, first-out) semantics to model data channels between producers and consumers. A FIFO data channel stores tokens in a buffer that is accessed indirectly via read- and write-pointers. This indirect token-access decouples a producer’s write-operations from the read-operations of the consumer, thereby making dataflow implicit. For a compiler, indirect token-access obscures data-dependencies, which renders standard optimizations ineffective and impacts stream program performance negatively. In this paper we propose a transformation for structured stream programming languages such as StreamIt that shifts FIFO buffer management from run-time to compile-time and eliminates splitters and joiners, whose task is to distribute and merge streams. To show the effectiveness of our lowering transformation, we have implemented a StreamIt to C compilation framework. We have developed our own intermediate representation (IR) called LaminarIR, which facilitates the transformation. We report on the enabling effect of the LaminarIR on LLVM’s optimizations, which required the conversion of several standard StreamIt benchmarks from static to randomized input, to prevent computation of partial results at compile-time. We conducted our experimental evaluation on the Intel i7-2600K, AMD Opteron 6378, Intel Xeon Phi 3120A and ARM Cortex-A15 platforms. Our LaminarIR reduces data-communication on average by 35.9% and achieves platform-specific speedups between 3.73x and 4.98x over StreamIt. We reduce memory accesses by more than 60% and achieve energy savings of up to 93.6% on the Intel i7-2600K.
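The contrast between indirect and direct token access in this abstract can be sketched by hand. The code below is neither StreamIt nor LaminarIR output; it assumes a hypothetical two-stage pipeline (a producer that doubles each input and a consumer that sums tokens in pairs) and shows it once with a run-time ring buffer and once with the queue replaced by named local values, which is the shape a compiler can analyze directly.

```python
class Fifo:
    """Run-time FIFO: tokens are reached indirectly via read/write pointers."""

    def __init__(self, capacity):
        self.buf = [0] * capacity
        self.read = 0
        self.write = 0

    def push(self, token):
        self.buf[self.write % len(self.buf)] = token
        self.write += 1

    def pop(self):
        token = self.buf[self.read % len(self.buf)]
        self.read += 1
        return token


def pipeline_with_fifo(inputs):
    """Producer and consumer communicate through the ring buffer."""
    queue = Fifo(2)
    out = []
    for i in range(0, len(inputs), 2):
        queue.push(2 * inputs[i])             # producer firing 1
        queue.push(2 * inputs[i + 1])         # producer firing 2
        out.append(queue.pop() + queue.pop())  # consumer firing
    return out


def pipeline_direct(inputs):
    """Same schedule with the queue resolved ahead of time into named tokens."""
    out = []
    for i in range(0, len(inputs), 2):
        t0 = 2 * inputs[i]        # token 0 of this steady-state iteration
        t1 = 2 * inputs[i + 1]    # token 1 of this steady-state iteration
        out.append(t0 + t1)       # data dependence is now explicit
    return out


if __name__ == "__main__":
    data = [1, 2, 3, 4]
    print(pipeline_with_fifo(data), pipeline_direct(data))
```

Both versions compute the same results; the second simply exposes which value flows from which producer firing to the consumer, without buffer or pointer updates in between.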

Citations: 6

Monitoring refinement via symbolic reasoning

Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation Pub Date : 2015-06-03 DOI: 10.1145/2737924.2737983

M. Emmi, C. Enea, Jad Hamza

Citations: 19

Verifying read-copy-update in a logic for weak memory

Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation Pub Date : 2015-06-03 DOI: 10.1145/2737924.2737992

Joseph Tassarotti, Derek Dreyer, Viktor Vafeiadis

Citations: 51

Making numerical program analysis fast

Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation Pub Date : 2015-06-03 DOI: 10.1145/2737924.2738000

Gagandeep Singh, Markus Püschel, Martin T. Vechev

Abstract: Numerical abstract domains are a fundamental component in modern static program analysis and are used in a wide range of scenarios (e.g. computing array bounds, disjointness, etc). However, analysis with these domains can be very expensive, deeply affecting the scalability and practical applicability of the static analysis. Hence, it is critical to ensure that these domains are made highly efficient. In this work, we present a complete approach for optimizing the performance of the Octagon numerical abstract domain, a domain shown to be particularly effective in practice. Our optimization approach is based on two key insights: i) the ability to perform online decomposition of the octagons leading to a massive reduction in operation counts, and ii) leveraging classic performance optimizations from linear algebra such as vectorization, locality of reference, scalar replacement and others, for improving the key bottlenecks of the domain. Applying these ideas, we designed new algorithms for the core Octagon operators with better asymptotic runtime than prior work and combined them with the optimization techniques to achieve high actual performance. We implemented our approach in the Octagon operators exported by the popular APRON C library, thus enabling existing static analyzers using APRON to immediately benefit from our work. To demonstrate the performance benefits of our approach, we evaluated our framework on three published static analyzers showing massive speed-ups for the time spent in Octagon analysis (e.g., up to 146x) as well as significant end-to-end program analysis speed-ups (up to 18.7x). Based on these results, we believe that our framework can serve as a new basis for static analysis with the Octagon numerical domain.
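The decomposition idea in this abstract can be illustrated with a rough sketch. The code below is not the paper's algorithm and does not call APRON; it assumes octagon constraints are given as hypothetical tuples `(sign_x, x, sign_y, y, bound)` meaning `sign_x*x + sign_y*y <= bound`, groups variables that share a constraint into independent blocks with a union-find, and compares a cubic cost estimate for closing one large matrix against closing each block separately.

```python
def partition(variables, constraints):
    """Group variables into blocks connected by at least one constraint."""
    parent = {v: v for v in variables}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for _sign_x, x, _sign_y, y, _bound in constraints:
        parent[find(x)] = find(y)

    blocks = {}
    for v in variables:
        blocks.setdefault(find(v), []).append(v)
    return sorted(blocks.values())


def closure_cost(num_vars):
    """Assumed cost model: cubic in the 2n x 2n difference-bound matrix size."""
    return (2 * num_vars) ** 3


if __name__ == "__main__":
    variables = ["a", "b", "c", "d", "e"]
    constraints = [
        (+1, "a", -1, "b", 3),   # a - b <= 3
        (+1, "b", +1, "c", 5),   # b + c <= 5
        (+1, "d", -1, "e", 0),   # d - e <= 0
    ]
    blocks = partition(variables, constraints)
    print(blocks)
    print(closure_cost(len(variables)), sum(closure_cost(len(b)) for b in blocks))
```

For this example the variables split into two blocks, and the estimated closure work drops from 1000 to 280 units under the assumed cost model. The abstract's "online" decomposition maintains such partitions as the analysis proceeds, which this static sketch does not attempt.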

Citations: 41

Blame and coercion: together again for the first time

Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation Pub Date : 2015-06-03 DOI: 10.1145/2737924.2737968

Jeremy G. Siek, Peter Thiemann, P. Wadler

Abstract: C#, Dart, Pyret, Racket, TypeScript, VB: many recent languages integrate dynamic and static types via gradual typing. We systematically develop three calculi for gradual typing and the relations between them, building on and strengthening previous work. The calculi are: λB, based on the blame calculus of Wadler and Findler (2009); λC, inspired by the coercion calculus of Henglein (1994); λS inspired by the space-efficient calculus of Herman, Tomb, and Flanagan (2006) and the threesome calculus of Siek and Wadler (2010). While λB is little changed from previous work, λC and λS are new. Together, λB, λC, and λS provide a coherent foundation for design, implementation, and optimisation of gradual types. We define translations from λB to λC and from λC to λS. Much previous work lacked proofs of correctness or had weak correctness criteria; here we demonstrate the strongest correctness criterion one could hope for, that each of the translations is fully abstract. Each of the calculi reinforces the design of the others: λC has a particularly simple definition, and the subtle definition of blame safety for λB is justified by the simple definition of blame safety for λC. Our calculus λS is implementation-ready: the first space-efficient calculus that is both straightforward to implement and easy to understand. We give two applications: first, using full abstraction from λC to λS to validate the challenging part of full abstraction between λB and λC; and, second, using full abstraction from λB to λS to easily establish the Fundamental Property of Casts, which required a custom bisimulation and six lemmas in earlier work.

Citations: 33