Proceedings of the ACM/IEEE International Symposium on Low Power Electronics and Design: Latest Publications

A Unified Forward Error Correction Accelerator for Multi-Mode Turbo, LDPC, and Polar Decoding

Proceedings of the ACM/IEEE International Symposium on Low Power Electronics and Design Pub Date : 2022-08-01 DOI: 10.1145/3531437.3539726

Y. Yue, T. Ajayi, Xueyang Liu, Peiwen Xing, Zihan Wang, D. Blaauw, R. Dreslinski, Hun-Seok Kim

Citations: 1

A Charge Domain P-8T SRAM Compute-In-Memory with Low-Cost DAC/ADC Operation for 4-bit Input Processing

Proceedings of the ACM/IEEE International Symposium on Low Power Electronics and Design Pub Date : 2022-08-01 DOI: 10.1145/3531437.3539718

Joonhyung Kim, Kyeongho Lee, Jongsun Park

Abstract: This paper presents a low cost PMOS-based 8T (P-8T) SRAM Compute-In-Memory (CIM) architecture that efficiently performs the multiply-accumulate (MAC) operations between 4-bit input activations and 8-bit weights. First, bit-line (BL) charge-sharing technique is employed to design the low-cost and reliable digital-to-analog conversion of 4-bit input activations in the proposed SRAM CIM, where the charge domain analog computing provides variation tolerant and linear MAC outputs. The 16 local arrays are also effectively exploited to implement the analog multiplication unit (AMU) that simultaneously produces 16 multiplication results between 4-bit input activations and 1-bit weights. For the hardware cost reduction of analog-to-digital converter (ADC) without sacrificing DNN accuracy, hardware aware system simulations are performed to decide the ADC bit-resolutions and the number of activated rows in the proposed CIM macro. In addition, for the ADC operation, the AMU-based reference columns are utilized for generating ADC reference voltages, with which low-cost 4-bit coarse-fine flash ADC has been designed. The 256×80 P-8T SRAM CIM macro implementation using 28nm CMOS process shows that the proposed CIM shows the accuracies of 91.46% and 66.67% with CIFAR-10 and CIFAR-100 dataset, respectively, with the energy efficiency of 50.07-TOPS/W.
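
Note: the abstract above describes MACs between 4-bit activations and 8-bit weights, built from multiplications of 4-bit activations by 1-bit weights. The following is a minimal digital sketch of how such a decomposition can work arithmetically. It assumes unsigned operands and a shift-and-add recombination of the eight weight bit-planes; the abstract does not state how the paper combines the partial results, and the sketch does not model charge sharing or ADC quantization. The function names are hypothetical.

```python
def mac_bit_serial(activations, weights, weight_bits=8):
    """Multiply-accumulate computed one weight bit-plane at a time.

    Assumptions (not taken from the paper): operands are unsigned and
    the per-plane partial sums are recombined by shift-and-add.
    """
    assert len(activations) == len(weights)
    assert all(0 <= a < 16 for a in activations)              # 4-bit inputs
    assert all(0 <= w < (1 << weight_bits) for w in weights)  # 8-bit weights

    total = 0
    for bit in range(weight_bits):
        # Extract the 1-bit weight of every operand for this bit-plane.
        plane = [(w >> bit) & 1 for w in weights]
        # Sum of (4-bit activation x 1-bit weight) products for this plane.
        partial = sum(a * p for a, p in zip(activations, plane))
        # Weight the partial sum by the significance of the bit-plane.
        total += partial << bit
    return total


def mac_reference(activations, weights):
    """Direct multiply-accumulate, used only to check the decomposition."""
    return sum(a * w for a, w in zip(activations, weights))


acts = [3, 15, 0, 7]
wts = [200, 1, 255, 18]
result = mac_bit_serial(acts, wts)
```

Both functions return the same value for any valid unsigned inputs, which is the property that lets a macro built from 1-bit weight multiplications serve multi-bit weights.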

Citations: 0

Visible Light Synchronization for Time-Slotted Energy-Aware Transiently-Powered Communication

Proceedings of the ACM/IEEE International Symposium on Low Power Electronics and Design Pub Date : 2022-08-01 DOI: 10.1145/3531437.3539722

A. Torrisi, Maria Doglioni, K. Yıldırım, D. Brunelli

Citations: 0

RACE: RISC-V SoC for En/decryption Acceleration on the Edge for Homomorphic Computation

Proceedings of the ACM/IEEE International Symposium on Low Power Electronics and Design Pub Date : 2022-08-01 DOI: 10.1145/3531437.3539725

Zahra Azad, Guowei Yang, R. Agrawal, Daniel Petrisko, Michael B. Taylor, A. Joshi

Abstract: As more and more edge devices connect to the cloud to use its storage and compute capabilities, they bring in security and data privacy concerns. Homomorphic Encryption (HE) is a promising solution to maintain data privacy by enabling computations on the encrypted user data in the cloud. While there has been a lot of work on accelerating HE computation in the cloud, little attention has been paid to optimize the en/decryption on the edge. Therefore, in this paper, we present RACE, a custom-designed area- and energy-efficient SoC for en/decryption of data for HE. Owing to similar operations in en/decryption, RACE unifies the en/decryption datapath to save area. RACE efficiently exploits techniques like memory reuse and data reordering to utilize minimal amount of on-chip memory. We evaluate RACE using a complete RTL design containing a RISC-V processor and our unified accelerator. Our analysis shows that, for the end-to-end en/decryption, using RACE leads to, on average, 48× to 39729× (for a wide range of security parameters) more energy-efficient solution than purely using a processor.

Citations: 3

FlexiDRAM: A Flexible in-DRAM Framework to Enable Parallel General-Purpose Computation

Proceedings of the ACM/IEEE International Symposium on Low Power Electronics and Design Pub Date : 2022-08-01 DOI: 10.1145/3531437.3539721

Ranyang Zhou, A. Roohi, Durga Misra, Shaahin Angizi

Citations: 3

HOGEye: Neural Approximation of HOG Feature Extraction in RRAM-Based 3D-Stacked Image Sensors

Proceedings of the ACM/IEEE International Symposium on Low Power Electronics and Design Pub Date : 2022-08-01 DOI: 10.1145/3531437.3539706

T. Ma, Weidong Cao, Fei Qiao, Ayan Chakrabarti, Xuan Zhang

Abstract: Many computer vision tasks, ranging from recognition to multi-view registration, operate on feature representation of images rather than raw pixel intensities. However, conventional pipelines for obtaining these representations incur significant energy consumption due to pixel-wise analog-to-digital (A/D) conversions and costly storage and computations. In this paper, we propose HOGEye, an efficient near-pixel implementation for a widely-used feature extraction algorithm, Histograms of Oriented Gradients (HOG). HOGEye moves the key but computation-intensive derivative extraction (DE) and histogram generation (HG) steps into the analog domain by applying a novel neural approximation method in a resistive random-access memory (RRAM)-based 3D-stacked image sensor. The co-location of perception (sensor) and computation (DE and HG) and the alleviation of A/D conversions allow HOGEye design to achieve significant energy saving. With negligible detection rate degradation, the entire HOGEye sensor system consumes less than 48μW@30fps for an image resolution of 256 × 256 (equivalent to 24.3pJ/pixel) while the processing part only consumes 14.1pJ/pixel, achieving more than 2.5× energy efficiency improvement than the state-of-the-art designs.
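
Note: the abstract above relates a power figure (less than 48 μW at 30 fps for 256 × 256 pixels) to an energy figure (24.3 pJ/pixel). The short sketch below shows the unit conversion between the two, assuming the energy is simply average power divided by the pixel throughput (frames per second times pixels per frame). The function names are hypothetical and the calculation is a consistency check, not the authors' measurement method.

```python
def energy_per_pixel_pj(power_w: float, fps: float, width: int, height: int) -> float:
    """Energy per pixel in picojoules for a sensor drawing power_w at fps frames/s."""
    energy_per_frame_j = power_w / fps  # joules spent on one frame
    return energy_per_frame_j / (width * height) * 1e12


def power_from_energy_w(energy_pj: float, fps: float, width: int, height: int) -> float:
    """Inverse conversion: average power in watts implied by an energy-per-pixel figure."""
    return energy_pj * 1e-12 * width * height * fps


# Upper bound implied by the 48 uW figure: about 24.41 pJ/pixel.
upper_bound_pj = energy_per_pixel_pj(48e-6, 30, 256, 256)

# Power implied by the quoted 24.3 pJ/pixel: about 47.8 uW, i.e. below 48 uW.
implied_power_w = power_from_energy_w(24.3, 30, 256, 256)
```

Under this assumption the two quoted figures agree: 24.3 pJ/pixel corresponds to a power just under the stated 48 μW bound.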

Citations: 3

A Study on Optimizing Pin Accessibility of Standard Cells in the Post-3 nm Node

Proceedings of the ACM/IEEE International Symposium on Low Power Electronics and Design Pub Date : 2022-08-01 DOI: 10.1145/3531437.3539707

J. Jeong, Jonghyun Ko, Taigon Song

Citations: 1

Drift-tolerant Coding to Enhance the Energy Efficiency of Multi-Level-Cell Phase-Change Memory

Proceedings of the ACM/IEEE International Symposium on Low Power Electronics and Design Pub Date : 2022-08-01 DOI: 10.1145/3531437.3539701

Yi-Shen Chen, Yuan-Hao Chang, Tei-Wei Kuo

Abstract: Phase-Change Memory (PCM) has emerged as a promising memory and storage technology in recent years, and Multi-Level-Cell (MLC) PCM further reduces the per-bit cost to improve its competitiveness by storing multiple bits in each PCM cell. However, MLC PCM has high energy consumption issue in its write operations. In contrast to existing works that try to enhance the energy efficiency of the physical program&verify strategy for MLC PCM, this work proposes a drift-tolerant coding scheme to enable the fast write operation on MLC PCM without sacrificing any data accuracy. By exploiting the resistance drift and asymmetric write characteristic of PCM cells, the proposed scheme can reduce the write energy consumption of MLC PCM significantly. Meanwhile, a segmentation strategy is proposed to further improve the write performance with our coding scheme. A series of analyses and experiments was conducted to evaluate the capability of the proposed scheme. The results show that the proposed scheme can reduce 6.2–17.1% energy consumption and 3.2–11.3% write latency under six representative benchmarks, compared with the existing well-known schemes.

Citations: 5

Evolving Skyrmion Racetrack Memory as Energy-Efficient Last-Level Cache Devices

Proceedings of the ACM/IEEE International Symposium on Low Power Electronics and Design Pub Date : 2022-08-01 DOI: 10.1145/3531437.3539709

Ya-Hui Yang, Shuo-Han Chen, Yuan-Hao Chang

Abstract: Skyrmion racetrack memory (SK-RM) has been regarded as a promising alternative to replace static random-access memory (SRAM) as a large-size on-chip cache device with high memory density. Different from other nonvolatile random-access memories (NVRAMs), data bits of SK-RM can only be altered or detected at access ports, and shift operations are required to move data bits across access ports along the racetrack. Owing to these special characteristics, word-based mapping and bit-interleaved mapping architectures have been proposed to facilitate reading and writing on SK-RM with different data layouts. Nevertheless, when SK-RM is used as an on-chip cache device, existing mapping architectures lead to the concerns of unpredictable access performance or excessive energy consumption during both data reads and writes. To resolve such concerns, this paper proposes extracting the merits of existing mapping architectures for allowing SK-RM to seamlessly switch its data update policy by considering the write latency requirement of cache accesses. Promising results have been demonstrated through a series of benchmark-driven experiments.
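
Note: the abstract above states that racetrack bits can only be read or written at access ports and must be shifted along the track to reach one. The toy model below illustrates why access cost then depends on data position. It is a simplified sketch under stated assumptions: a single track, fixed port positions, each access shifting the whole track to the nearest port, and no modeling of overflow regions, skyrmion physics, or the word-based and bit-interleaved mappings the paper discusses. The class and method names are hypothetical.

```python
class Racetrack:
    """Toy single-track model that counts shifts needed to reach an access port."""

    def __init__(self, n_bits, ports):
        self.n_bits = n_bits
        self.ports = list(ports)   # fixed physical positions of the access ports
        self.offset = 0            # current displacement of the track contents
        self.total_shifts = 0      # running count of one-position shifts

    def access(self, bit_index):
        """Align bit_index with its nearest port and return the shifts used."""
        if not 0 <= bit_index < self.n_bits:
            raise IndexError("bit index outside the track")
        position = bit_index + self.offset           # where the bit sits now
        nearest = min(self.ports, key=lambda p: abs(p - position))
        delta = nearest - position                   # signed shift distance
        self.offset += delta                         # the whole track moves
        self.total_shifts += abs(delta)
        return abs(delta)


track = Racetrack(n_bits=16, ports=[4, 12])
costs = [track.access(i) for i in (4, 5, 12, 0)]
```

In this model, a bit already under a port costs no shifts, while a bit far from every port costs several, which is the position-dependent latency that data layout policies try to control.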

Citations: 1

A Domain-Specific System-On-Chip Design for Energy Efficient Wearable Edge AI Applications

Proceedings of the ACM/IEEE International Symposium on Low Power Electronics and Design Pub Date : 2022-08-01 DOI: 10.1145/3531437.3539711

Yigit Tuncel, A. Krishnakumar, Aishwarya Lekshmi Chithra, Younghyun Kim, Ümit Y. Ogras

Citations: 0