A Compilation Tool for Computation Offloading in ReRAM-based CIM Architectures

IF 1.8, Zone 3 (Computer Science), JCR Q4 (COMPUTER SCIENCE, HARDWARE & ARCHITECTURE)

ACM Transactions on Architecture and Code Optimization, Pub Date: 2023-09-05, DOI: 10.1145/3617686

Hai Jin, Bo Lei, Haikun Liu, Xiaofei Liao, Zhuohui Duan, Chencheng Ye, Yu Zhang


Citations: 0



A Compilation Tool for Computation Offloading in ReRAM-based CIM Architectures

Computing-In-Memory (CIM) architectures using Non-Volatile Memories (NVMs) have emerged as a promising way to address the “memory wall” problem in traditional Von Neumann architectures. CIM accelerators can perform arithmetic or Boolean logic operations in NVMs by fully exploiting their high parallelism for bit-wise operations. These accelerators are often used in cooperation with general-purpose processors to speed up a wide variety of artificial neural network applications. In such a heterogeneous computing architecture, the legacy software should be redesigned and re-engineered to utilize new CIM accelerators. In this paper, we propose a compilation tool to automatically migrate legacy programs to such heterogeneous architectures based on the LLVM compiler infrastructure. To accelerate some computations such as vector-matrix multiplication in CIM accelerators, we identify several typical computing patterns from LLVM intermediate representations (IRs), which are oblivious to high-level programming paradigms. Our compilation tool can modify acceleratable LLVM IRs to offload them to CIM accelerators automatically, without re-engineering legacy software. Experimental results show that our compilation tool can translate many legacy programs to CIM-supported binary executables effectively, and improve application performance and energy efficiency by up to 51 × and 309 ×, respectively, compared with general-purpose x86 processors.
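The idea of recognizing computing patterns at the IR level, independent of the source language, can be illustrated with a small toy. The sketch below is not the authors' tool and does not use the LLVM API: it scans a simplified, LLVM-like textual IR for a multiply-accumulate idiom (an `fadd` that consumes the result of an `fmul`) and marks the matched instructions as offload candidates. The regular expressions, the function names, and the `@cim_dot` entry point are all assumptions made for illustration only.

```python
import re

# Toy matcher for a multiply-accumulate (MAC) idiom in LLVM-like IR text.
# Everything here is illustrative: the regexes cover only this simplified
# textual form, and `@cim_dot` is a hypothetical runtime entry point.
FMUL = re.compile(
    r"^\s*(%[\w.]+) = fmul (?:fast )?(float|double) (%[\w.]+), (%[\w.]+)\s*$"
)
FADD = re.compile(
    r"^\s*(%[\w.]+) = fadd (?:fast )?(float|double) (%[\w.]+), (%[\w.]+)\s*$"
)


def find_mac_patterns(ir_lines):
    """Return (mul_index, add_index) pairs where an fadd consumes an fmul result."""
    products = {}  # SSA name of an fmul result -> (line index, element type)
    matches = []
    for i, line in enumerate(ir_lines):
        m = FMUL.match(line)
        if m:
            products[m.group(1)] = (i, m.group(2))
            continue
        a = FADD.match(line)
        if a:
            # The product may appear as either operand of the addition.
            for operand in (a.group(3), a.group(4)):
                if operand in products and products[operand][1] == a.group(2):
                    matches.append((products[operand][0], i))
                    break
    return matches


def mark_offload_candidates(ir_lines):
    """Annotate matched lines with a comment naming the hypothetical CIM call."""
    flagged = {idx for pair in find_mac_patterns(ir_lines) for idx in pair}
    return [
        line + "  ; candidate for @cim_dot (hypothetical)" if i in flagged else line
        for i, line in enumerate(ir_lines)
    ]


# A hand-written loop body of a dot product, as a stand-in for real compiler output.
loop_body = [
    "  %a = load float, float* %pa",
    "  %b = load float, float* %pb",
    "  %mul = fmul float %a, %b",
    "  %acc.next = fadd float %acc, %mul",
]
print(find_mac_patterns(loop_body))
```

Because the match is on instructions rather than on source constructs, the same idiom would be found whether the loop was written in C, C++, or another language that compiles to this IR. A real tool would presumably also have to analyze the enclosing loop and memory accesses before rewriting anything, which this sketch does not attempt.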


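Source journal

ACM Transactions on Architecture and Code Optimization (Engineering & Technology, Computer Science: Theory & Methods)

CiteScore: 3.60

Self-citation rate: 6.20%

Articles published:

Review time: 6-12 weeks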

About the journal: ACM Transactions on Architecture and Code Optimization (TACO) focuses on hardware, software, and system research spanning the fields of computer architecture and code optimization. Articles that appear in TACO will either present new techniques and concepts or report on experiences and experiments with actual systems. Insights useful to architects, hardware or software developers, designers, builders, and users will be emphasized.