International Conference on Compiler Construction: Latest Papers

If-Convert as Early as You Must

International Conference on Compiler Construction Pub Date : 2024-02-17 DOI: 10.1145/3640537.3641562

Dorit Nuzman, A. Zaks, Ziv Ben-Zion

Citations: 0

CoSense: Compiler Optimizations using Sensor Technical Specifications

International Conference on Compiler Construction Pub Date : 2024-02-17 DOI: 10.1145/3640537.3641576

Pei Mu, Nikolaos Mavrogeorgis, Christos Vasiladiotis, Vasileios Tsoutsouras, Orestis Kaparounakis, Phillip Stanley-Marbell, Antonio Barbalace

Citations: 1

From Low-Level Fault Modeling (of a Pipeline Attack) to a Proven Hardening Scheme

International Conference on Compiler Construction Pub Date : 2024-02-17 DOI: 10.1145/3640537.3641570

Sébastien Michelland, C. Deleuze, Laure Gonnord

Citations: 0

UNIFICO: Thread Migration in Heterogeneous-ISA CPUs without State Transformation

International Conference on Compiler Construction Pub Date : 2024-02-17 DOI: 10.1145/3640537.3641565

Nikolaos Mavrogeorgis, Christos Vasiladiotis, Pei Mu, Amir Khordadi, Björn Franke, Antonio Barbalace

Citations: 0

Fast and Accurate Context-Aware Basic Block Timing Prediction using Transformers

International Conference on Compiler Construction Pub Date : 2024-02-17 DOI: 10.1145/3640537.3641572

A. N. Amalou, Elisa Fromont, Isabelle Puaut

Citations: 0

Fast Template-Based Code Generation for MLIR

International Conference on Compiler Construction Pub Date : 2024-02-17 DOI: 10.1145/3640537.3641567

Florian Drescher, Alexis Engelke

Citations: 0

Clog: A Declarative Language for C Static Code Checkers

International Conference on Compiler Construction Pub Date : 2024-02-17 DOI: 10.1145/3640537.3641579

Alexandru Dura, Christoph Reichenbach

Citations: 0

FlowProf: Profiling Multi-threaded Programs using Information-Flow

International Conference on Compiler Construction Pub Date : 2024-02-17 DOI: 10.1145/3640537.3641577

Ahamed Al Nahian, Brian Demsky

Citations: 0

Compiler-Based Memory Encryption for Machine Learning on Commodity Low-Power Devices

International Conference on Compiler Construction Pub Date : 2024-02-17 DOI: 10.1145/3640537.3641564

Kiwan Maeng, Brandon Lucia

Citations: 0

APPy: Annotated Parallelism for Python on GPUs

International Conference on Compiler Construction Pub Date : 2024-02-17 DOI: 10.1145/3640537.3641575

Tong Zhou, J. Shirako, Vivek Sarkar

Abstract: GPUs are increasingly being used to speed up Python applications in the scientific computing and machine learning domains. Currently, the two common approaches to leveraging GPU acceleration in Python are 1) creating a custom native GPU kernel and importing it as a function that can be called from Python; 2) using libraries such as CuPy, which provide pre-defined GPU-implementation-backed tensor operators. The first approach is very flexible but requires tremendous manual effort to create a correct and high-performance GPU kernel. While the second approach dramatically improves productivity, it is limited in its generality, as many applications cannot be expressed purely using CuPy's pre-defined tensor operators. Additionally, redundant memory access can often occur between adjacent tensor operators due to the materialization of intermediate results. In this work, we present APPy (Annotated Parallelism for Python), which enables users to parallelize generic Python loops and tensor expressions for execution on GPUs by adding simple compiler directives (annotations) to Python code. Empirical evaluation on 20 scientific computing kernels from the literature on a server with an AMD Ryzen 7 5800X 8-Core CPU and an NVIDIA RTX 3090 GPU demonstrates that with simple pragmas APPy is able to generate more efficient GPU code and achieves significant geometric mean speedup relative to CuPy (30× on average), and to three state-of-the-art Python compilers, Numba (8.3× on average), DaCe-GPU (3.1× on average) and JAX-GPU (18.8× on average).

Pages: 113-125

Citations: 0