{"title":"If-Convert as Early as You Must","authors":"Dorit Nuzman, A. Zaks, Ziv Ben-Zion","doi":"10.1145/3640537.3641562","DOIUrl":"https://doi.org/10.1145/3640537.3641562","url":null,"abstract":"","PeriodicalId":147184,"journal":{"name":"International Conference on Compiler Construction","volume":"241 ","pages":"26-38"},"PeriodicalIF":0.0,"publicationDate":"2024-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140453262","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"UNIFICO: Thread Migration in Heterogeneous-ISA CPUs without State Transformation","authors":"Nikolaos Mavrogeorgis, Christos Vasiladiotis, Pei Mu, Amir Khordadi, Björn Franke, Antonio Barbalace","doi":"10.1145/3640537.3641565","DOIUrl":"https://doi.org/10.1145/3640537.3641565","url":null,"abstract":"","PeriodicalId":147184,"journal":{"name":"International Conference on Compiler Construction","volume":"87 2","pages":"86-99"},"PeriodicalIF":0.0,"publicationDate":"2024-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140453114","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fast and Accurate Context-Aware Basic Block Timing Prediction using Transformers","authors":"A. N. Amalou, Elisa Fromont, Isabelle Puaut","doi":"10.1145/3640537.3641572","DOIUrl":"https://doi.org/10.1145/3640537.3641572","url":null,"abstract":"","PeriodicalId":147184,"journal":{"name":"International Conference on Compiler Construction","volume":"126 ","pages":"227-237"},"PeriodicalIF":0.0,"publicationDate":"2024-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140453282","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"From Low-Level Fault Modeling (of a Pipeline Attack) to a Proven Hardening Scheme","authors":"Sébastien Michelland, C. Deleuze, Laure Gonnord","doi":"10.1145/3640537.3641570","DOIUrl":"https://doi.org/10.1145/3640537.3641570","url":null,"abstract":"","PeriodicalId":147184,"journal":{"name":"International Conference on Compiler Construction","volume":"173 1","pages":"174-185"},"PeriodicalIF":0.0,"publicationDate":"2024-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140453320","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fast Template-Based Code Generation for MLIR","authors":"Florian Drescher, Alexis Engelke","doi":"10.1145/3640537.3641567","DOIUrl":"https://doi.org/10.1145/3640537.3641567","url":null,"abstract":"Fast compilation is essential for JIT-compilation use cases like dynamic languages or databases as well as development productivity when compiling static languages. Template-based compilation allows fast compilation times, but in existing approaches, templates are generally handwritten, limiting flexibility and causing substantial engineering effort. In this paper, we introduce an approach based on MLIR that derives code templates for the instructions of any dialect automatically ahead-of-time. Template generation re-uses the existing compilation path present in the MLIR lowering of the instructions and thereby inherently supports code generation from different abstraction levels in a single step. Our results on compiling database queries and standard C programs show a compile-time improvement of 10–30x compared to LLVM -O0 with only moderate run-time slow-downs of 1–3x, resulting in an overall improvement of 2x in a JIT-compilation-based database setting.","PeriodicalId":147184,"journal":{"name":"International Conference on Compiler Construction","volume":"202 ","pages":"1-12"},"PeriodicalIF":0.0,"publicationDate":"2024-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140453124","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Clog: A Declarative Language for C Static Code Checkers","authors":"Alexandru Dura, Christoph Reichenbach","doi":"10.1145/3640537.3641579","DOIUrl":"https://doi.org/10.1145/3640537.3641579","url":null,"abstract":"","PeriodicalId":147184,"journal":{"name":"International Conference on Compiler Construction","volume":"95 ","pages":"186-197"},"PeriodicalIF":0.0,"publicationDate":"2024-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140453150","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FlowProf: Profiling Multi-threaded Programs using Information-Flow","authors":"Ahamed Al Nahian, Brian Demsky","doi":"10.1145/3640537.3641577","DOIUrl":"https://doi.org/10.1145/3640537.3641577","url":null,"abstract":"","PeriodicalId":147184,"journal":{"name":"International Conference on Compiler Construction","volume":"231 ","pages":"137-149"},"PeriodicalIF":0.0,"publicationDate":"2024-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140453206","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"APPy: Annotated Parallelism for Python on GPUs","authors":"Tong Zhou, J. Shirako, Vivek Sarkar","doi":"10.1145/3640537.3641575","DOIUrl":"https://doi.org/10.1145/3640537.3641575","url":null,"abstract":"GPUs are increasingly being used to speed up Python applications in the scientific computing and machine learning domains. Currently, the two common approaches to leveraging GPU acceleration in Python are 1) create a custom native GPU kernel, and import it as a function that can be called from Python; 2) use libraries such as CuPy, which provides pre-defined GPU-implementation-backed tensor operators. The first approach is very flexible but requires tremendous manual effort to create a correct and high-performance GPU kernel. While the second approach dramatically improves productivity, it is limited in its generality, as many applications cannot be expressed purely using CuPy’s pre-defined tensor operators. Additionally, redundant memory access can often occur between adjacent tensor operators due to the materialization of intermediate results. In this work, we present APPy (Annotated Parallelism for Python), which enables users to parallelize generic Python loops and tensor expressions for execution on GPUs by adding simple compiler directives (annotations) to Python code. Empirical evaluation on 20 scientific computing kernels from the literature on a server with an AMD Ryzen 7 5800X 8-Core CPU and an NVIDIA RTX 3090 GPU demonstrates that with simple pragmas APPy is able to generate more efficient GPU code and achieves significant geometric mean speedup relative to CuPy (30× on average), and to three state-of-the-art Python compilers, Numba (8.3× on average), DaCe-GPU (3.1× on average) and JAX-GPU (18.8× on average).","PeriodicalId":147184,"journal":{"name":"International Conference on Compiler Construction","volume":"54 8","pages":"113-125"},"PeriodicalIF":0.0,"publicationDate":"2024-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139960634","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}