{"title":"Polyhedral auto-transformation with no integer linear programming","authors":"Aravind Acharya, Uday Bondhugula, Albert Cohen","doi":"10.1145/3192366.3192401","DOIUrl":"https://doi.org/10.1145/3192366.3192401","url":null,"abstract":"State-of-the-art algorithms used in automatic polyhedral transformation for parallelization and locality optimization typically rely on Integer Linear Programming (ILP). This poses a scalability issue when scaling to tens or hundreds of statements, and may be disconcerting in production compiler settings. In this work, we consider relaxing integrality in the ILP formulation of the Pluto algorithm, a popular algorithm used to find good affine transformations. We show that the rational solutions obtained from the relaxed LP formulation can easily be scaled to valid integral ones to obtain desired solutions, although with some caveats. We first present formal results connecting the solution of the relaxed LP to the original Pluto ILP. We then show that there are difficulties in realizing the above theoretical results in practice, and propose an alternate approach to overcome those while still leveraging linear programming. Our new approach obtains dramatic compile-time speedups for a range of large benchmarks. While achieving these compile-time improvements, we show that the performance of the transformed code is not sacrificed. Our approach to automatic transformation provides a mean compilation time improvement of 5.6× over state-of-the-art on relevant challenging benchmarks from the NAS PB, SPEC CPU 2006, and PolyBench suites. We also came across situations where prior frameworks failed to find a transformation in a reasonable amount of time, while our new approach did so instantaneously.","PeriodicalId":20583,"journal":{"name":"Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation","volume":"2 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-06-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82579964","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jeehoon Kang, Yoonseung Kim, Youngju Song, Juneyoung Lee, Sanghoon Park, Mark Dongyeon Shin, Y. Kim, Sungkeun Cho, Joonwon Choi, C. Hur, K. Yi
{"title":"Crellvm: verified credible compilation for LLVM","authors":"Jeehoon Kang, Yoonseung Kim, Youngju Song, Juneyoung Lee, Sanghoon Park, Mark Dongyeon Shin, Y. Kim, Sungkeun Cho, Joonwon Choi, C. Hur, K. Yi","doi":"10.1145/3192366.3192377","DOIUrl":"https://doi.org/10.1145/3192366.3192377","url":null,"abstract":"Production compilers such as GCC and LLVM are large complex software systems, for which achieving a high level of reliability is hard. Although testing is an effective method for finding bugs, it alone cannot guarantee a high level of reliability. To provide a higher level of reliability, many approaches that examine compilers' internal logics have been proposed. However, none of them have been successfully applied to major optimizations of production compilers. This paper presents Crellvm: a verified credible compilation framework for LLVM, which can be used as a systematic way of providing a high level of reliability for major optimizations in LLVM. Specifically, we augment an LLVM optimizer to generate translation results together with their correctness proofs, which can then be checked by a proof checker formally verified in Coq. As case studies, we applied our approach to two major optimizations of LLVM: register promotion mem2reg and global value numbering gvn, having found four new miscompilation bugs (two in each).","PeriodicalId":20583,"journal":{"name":"Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation","volume":"5 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-06-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89216114","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
D. Koeplinger, Matthew Feldman, R. Prabhakar, Yaqi Zhang, Stefan Hadjis, Ruben Fiszel, Tian Zhao, Luigi Nardi, A. Pedram, C. Kozyrakis, K. Olukotun
{"title":"Spatial: a language and compiler for application accelerators","authors":"D. Koeplinger, Matthew Feldman, R. Prabhakar, Yaqi Zhang, Stefan Hadjis, Ruben Fiszel, Tian Zhao, Luigi Nardi, A. Pedram, C. Kozyrakis, K. Olukotun","doi":"10.1145/3192366.3192379","DOIUrl":"https://doi.org/10.1145/3192366.3192379","url":null,"abstract":"Industry is increasingly turning to reconfigurable architectures like FPGAs and CGRAs for improved performance and energy efficiency. Unfortunately, adoption of these architectures has been limited by their programming models. HDLs lack abstractions for productivity and are difficult to target from higher level languages. HLS tools are more productive, but offer an ad-hoc mix of software and hardware abstractions which make performance optimizations difficult. In this work, we describe a new domain-specific language and compiler called Spatial for higher level descriptions of application accelerators. We describe Spatial's hardware-centric abstractions for both programmer productivity and design performance, and summarize the compiler passes required to support these abstractions, including pipeline scheduling, automatic memory banking, and automated design tuning driven by active machine learning. We demonstrate the language's ability to target FPGAs and CGRAs from common source code. We show that applications written in Spatial are, on average, 42% shorter and achieve a mean speedup of 2.9x over SDAccel HLS when targeting a Xilinx UltraScale+ VU9P FPGA on an Amazon EC2 F1 instance.","PeriodicalId":20583,"journal":{"name":"Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-06-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91339441","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Timon Gehr, Sasa Misailovic, Petar Tsankov, L. Vanbever, Pascal Wiesmann, Martin T. Vechev
{"title":"Bayonet: probabilistic inference for networks","authors":"Timon Gehr, Sasa Misailovic, Petar Tsankov, L. Vanbever, Pascal Wiesmann, Martin T. Vechev","doi":"10.1145/3192366.3192400","DOIUrl":"https://doi.org/10.1145/3192366.3192400","url":null,"abstract":"Network operators often need to ensure that important probabilistic properties are met, such as that the probability of network congestion is below a certain threshold. Ensuring such properties is challenging and requires both a suitable language for probabilistic networks and an automated procedure for answering probabilistic inference queries. We present Bayonet, a novel approach that consists of: (i) a probabilistic network programming language and (ii) a system that performs probabilistic inference on Bayonet programs. The key insight behind Bayonet is to phrase the problem of probabilistic network reasoning as inference in existing probabilistic languages. As a result, Bayonet directly leverages existing probabilistic inference systems and offers a flexible and expressive interface to operators. We present a detailed evaluation of Bayonet on common network scenarios, such as network congestion, reliability of packet delivery, and others. Our results indicate that Bayonet can express such practical scenarios and answer queries for realistic topology sizes (with up to 30 nodes).","PeriodicalId":20583,"journal":{"name":"Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation","volume":"101 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-06-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74972386","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"HHVM JIT: a profile-guided, region-based compiler for PHP and Hack","authors":"Guilherme Ottoni","doi":"10.1145/3192366.3192374","DOIUrl":"https://doi.org/10.1145/3192366.3192374","url":null,"abstract":"Dynamic languages such as PHP, JavaScript, Python, and Ruby have been gaining popularity over the last two decades. A very popular domain for these languages is web development, including server-side development of large-scale websites. As a result, improving the performance of these languages has become more important. Efficiently compiling programs in these languages is challenging, and many popular dynamic languages still lack efficient production-quality implementations. This paper describes the design of the second generation of the HHVM JIT and how it addresses the challenges to efficiently execute PHP and Hack programs. This new design uses profiling to build an aggressive region-based JIT compiler. We discuss the benefits of this approach compared to the more popular method-based and trace-based approaches to compile dynamic languages. Our evaluation running a very large PHP-based code base, the Facebook website, demonstrates the effectiveness of the new JIT design.","PeriodicalId":20583,"journal":{"name":"Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation","volume":"28 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-06-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78425085","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Vikash K. Mansinghka, Ulrich Schaechtle, Shivam Handa, Alexey Radul, Yutian Chen, M. Rinard
{"title":"Probabilistic programming with programmable inference","authors":"Vikash K. Mansinghka, Ulrich Schaechtle, Shivam Handa, Alexey Radul, Yutian Chen, M. Rinard","doi":"10.1145/3192366.3192409","DOIUrl":"https://doi.org/10.1145/3192366.3192409","url":null,"abstract":"We introduce inference metaprogramming for probabilistic programming languages, including new language constructs, a formalism, and the rst demonstration of e ectiveness in practice. Instead of relying on rigid black-box inference algorithms hard-coded into the language implementation as in previous probabilistic programming languages, infer- ence metaprogramming enables developers to 1) dynamically decompose inference problems into subproblems, 2) apply in- ference tactics to subproblems, 3) alternate between incorpo- rating new data and performing inference over existing data, and 4) explore multiple execution traces of the probabilis- tic program at once. Implemented tactics include gradient- based optimization, Markov chain Monte Carlo, variational inference, and sequental Monte Carlo techniques. Inference metaprogramming enables the concise expression of proba- bilistic models and inference algorithms across diverse elds, such as computer vision, data science, and robotics, within a single probabilistic programming language.","PeriodicalId":20583,"journal":{"name":"Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation","volume":"32 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-06-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82660220","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CURD: a dynamic CUDA race detector","authors":"Yuanfeng Peng, Vinod Grover, Joseph Devietti","doi":"10.1145/3192366.3192368","DOIUrl":"https://doi.org/10.1145/3192366.3192368","url":null,"abstract":"As GPUs have become an integral part of nearly every pro- cessor, GPU programming has become increasingly popular. GPU programming requires a combination of extreme levels of parallelism and low-level programming, making it easy for concurrency bugs such as data races to arise. These con- currency bugs can be extremely subtle and di cult to debug due to the massive numbers of threads running concurrently on a modern GPU. While some tools exist to detect data races in GPU pro- grams, they are often prohibitively slow or focused only on a small class of data races in shared memory. Compared to prior work, our race detector, CURD, can detect data races precisely on both shared and global memory, selects an appropriate race detection algorithm based on the synchronization used in a program, and utilizes efficient compiler instrumentation to reduce performance overheads. Across 53 benchmarks, we find that using CURD incurs an aver- age slowdown of just 2.88x over native execution. CURD is 2.1x faster than Nvidia’s CUDA-Racecheck race detector, de- spite detecting a much broader class of races. CURD finds 35 races across our benchmarks, including bugs in established benchmark suites and in sample programs from Nvidia.","PeriodicalId":20583,"journal":{"name":"Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation","volume":"18 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-06-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83632273","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation","authors":"","doi":"10.1145/3192366","DOIUrl":"https://doi.org/10.1145/3192366","url":null,"abstract":"","PeriodicalId":20583,"journal":{"name":"Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation","volume":"29 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-06-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74649834","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stephen Dolan, K. Sivaramakrishnan, Anil Madhavapeddy
{"title":"Bounding data races in space and time","authors":"Stephen Dolan, K. Sivaramakrishnan, Anil Madhavapeddy","doi":"10.1145/3192366.3192421","DOIUrl":"https://doi.org/10.1145/3192366.3192421","url":null,"abstract":"We propose a new semantics for shared-memory parallel programs that gives strong guarantees even in the presence of data races. Our local data race freedom property guarantees that all data-race-free portions of programs exhibit sequential semantics. We provide a straightforward operational semantics and an equivalent axiomatic model, and evaluate an implementation for the OCaml programming language. Our evaluation demonstrates that it is possible to balance a comprehensible memory model with a reasonable (no overhead on x86, ~0.6% on ARM) sequential performance trade-off in a mainstream programming language.","PeriodicalId":20583,"journal":{"name":"Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation","volume":"191 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-06-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79577052","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Rumen Paletov, Petar Tsankov, Veselin Raychev, Martin T. Vechev
{"title":"Inferring crypto API rules from code changes","authors":"Rumen Paletov, Petar Tsankov, Veselin Raychev, Martin T. Vechev","doi":"10.1145/3192366.3192403","DOIUrl":"https://doi.org/10.1145/3192366.3192403","url":null,"abstract":"Creating and maintaining an up-to-date set of security rules that match misuses of crypto APIs is challenging, as crypto APIs constantly evolve over time with new cryptographic primitives and settings, making existing ones obsolete. To address this challenge, we present a new approach to extract security fixes from thousands of code changes. Our approach consists of: (i) identifying code changes, which often capture security fixes, (ii) an abstraction that filters irrelevant code changes (such as refactorings), and (iii) a clustering analysis that reveals commonalities between semantic code changes and helps in eliciting security rules. We applied our approach to the Java Crypto API and showed that it is effective: (i) our abstraction effectively filters non-semantic code changes (over 99% of all changes) without removing security fixes, and (ii) over 80% of the code changes are security fixes identifying security rules. Based on our results, we identified 13 rules, including new ones not supported by existing security checkers.","PeriodicalId":20583,"journal":{"name":"Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation","volume":"205 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-06-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78074959","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}