基于模拟RRAM的CIM系统的块数据流中的直接数据传输

IF 2.1 Q3 ENGINEERING, ELECTRICAL & ELECTRONIC

Frontiers in electronics Pub Date : 2023-04-17 DOI:10.3389/felec.2023.1129675

Yuyi Liu, B. Gao, Peng Yao, Qi Liu, Qingtian Zhang, Dong Wu, Jianshi Tang, H. Qian, Huaqiang Wu

{"title":"基于模拟RRAM的CIM系统的块数据流中的直接数据传输","authors":"Yuyi Liu, B. Gao, Peng Yao, Qi Liu, Qingtian Zhang, Dong Wu, Jianshi Tang, H. Qian, Huaqiang Wu","doi":"10.3389/felec.2023.1129675","DOIUrl":null,"url":null,"abstract":"Analog resistive random-access memory (RRAM)-based computation-in-memory (CIM) technology is promising for constructing artificial intelligence (AI) with high energy efficiency and excellent scalability. However, the large overhead of analog-to-digital converters (ADCs) is a key limitation. In this work, we propose a novel LINKAGE architecture that eliminates PE-level ADCs and leverages an analog data transfer module to implement inter-array data processing. A blockwise dataflow is further proposed to accelerate convolutional neural networks (CNNs) to speed up compute-intensive layers and solve the unbalanced pipeline problem. To obtain accurate and reliable benchmark results, key component modules, such as straightforward link (SFL) modules and Tile-level ADCs, are designed in standard 28 nm CMOS technology. The evaluation shows that LINKAGE outperforms the conventional ADC/DAC-based architecture with a 2.07×∼11.22× improvement in throughput, 2.45×∼7.00× in energy efficiency, and 22%–51% reduction in the area overhead while maintaining accuracy. Our LINKAGE architecture can achieve 22.9∼24.4 TOPS/W energy efficiency (4b-IN/4b-W) and 1.82 ∼4.53 TOPS throughput with the blockwise method. This work demonstrates a new method for significantly improving the energy efficiency of CIM chips, which can be applied to general CNNs/FCNNs.","PeriodicalId":73081,"journal":{"name":"Frontiers in electronics","volume":" ","pages":""},"PeriodicalIF":2.1000,"publicationDate":"2023-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Straightforward data transfer in a blockwise dataflow for an analog RRAM-based CIM system\",\"authors\":\"Yuyi Liu, B. Gao, Peng Yao, Qi Liu, Qingtian Zhang, Dong Wu, Jianshi Tang, H. Qian, Huaqiang Wu\",\"doi\":\"10.3389/felec.2023.1129675\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Analog resistive random-access memory (RRAM)-based computation-in-memory (CIM) technology is promising for constructing artificial intelligence (AI) with high energy efficiency and excellent scalability. However, the large overhead of analog-to-digital converters (ADCs) is a key limitation. In this work, we propose a novel LINKAGE architecture that eliminates PE-level ADCs and leverages an analog data transfer module to implement inter-array data processing. A blockwise dataflow is further proposed to accelerate convolutional neural networks (CNNs) to speed up compute-intensive layers and solve the unbalanced pipeline problem. To obtain accurate and reliable benchmark results, key component modules, such as straightforward link (SFL) modules and Tile-level ADCs, are designed in standard 28 nm CMOS technology. The evaluation shows that LINKAGE outperforms the conventional ADC/DAC-based architecture with a 2.07×∼11.22× improvement in throughput, 2.45×∼7.00× in energy efficiency, and 22%–51% reduction in the area overhead while maintaining accuracy. Our LINKAGE architecture can achieve 22.9∼24.4 TOPS/W energy efficiency (4b-IN/4b-W) and 1.82 ∼4.53 TOPS throughput with the blockwise method. This work demonstrates a new method for significantly improving the energy efficiency of CIM chips, which can be applied to general CNNs/FCNNs.\",\"PeriodicalId\":73081,\"journal\":{\"name\":\"Frontiers in electronics\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":2.1000,\"publicationDate\":\"2023-04-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Frontiers in electronics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.3389/felec.2023.1129675\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Frontiers in electronics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3389/felec.2023.1129675","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}

引用次数: 0

摘要

基于模拟电阻随机存取存储器（RRAM）的存储器中计算（CIM）技术有望构建具有高能效和良好可扩展性的人工智能。然而，模数转换器（ADC）的大开销是一个关键限制。在这项工作中，我们提出了一种新的LINKAGE架构，该架构消除了PE级ADC，并利用模拟数据传输模块来实现阵列间数据处理。进一步提出了一种分块数据流来加速卷积神经网络（CNNs），以加速计算密集层并解决不平衡管道问题。为了获得准确可靠的基准测试结果，关键组件模块，如直接链路（SFL）模块和平铺级ADC，采用标准28 nm CMOS技术设计。评估表明，LINKAGE在保持精度的同时，吞吐量提高了2.07×～11.22倍，能效提高了2.45×～7.00倍，面积开销降低了22%～51%，优于传统的基于ADC/DAC的架构。我们的LINKAGE架构可以通过分块方法实现22.9～24.4 TOPS/W能效（4b IN/4b-W）和1.82～4.53 TOPS吞吐量。这项工作展示了一种显著提高CIM芯片能效的新方法，该方法可应用于通用的CNNs/FCNN。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Straightforward data transfer in a blockwise dataflow for an analog RRAM-based CIM system

Analog resistive random-access memory (RRAM)-based computation-in-memory (CIM) technology is promising for constructing artificial intelligence (AI) with high energy efficiency and excellent scalability. However, the large overhead of analog-to-digital converters (ADCs) is a key limitation. In this work, we propose a novel LINKAGE architecture that eliminates PE-level ADCs and leverages an analog data transfer module to implement inter-array data processing. A blockwise dataflow is further proposed to accelerate convolutional neural networks (CNNs) to speed up compute-intensive layers and solve the unbalanced pipeline problem. To obtain accurate and reliable benchmark results, key component modules, such as straightforward link (SFL) modules and Tile-level ADCs, are designed in standard 28 nm CMOS technology. The evaluation shows that LINKAGE outperforms the conventional ADC/DAC-based architecture with a 2.07×∼11.22× improvement in throughput, 2.45×∼7.00× in energy efficiency, and 22%–51% reduction in the area overhead while maintaining accuracy. Our LINKAGE architecture can achieve 22.9∼24.4 TOPS/W energy efficiency (4b-IN/4b-W) and 1.82 ∼4.53 TOPS throughput with the blockwise method. This work demonstrates a new method for significantly improving the energy efficiency of CIM chips, which can be applied to general CNNs/FCNNs.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Frontiers in electronics

自引率

0.00%

发文量