In-Memory Computing based Machine Learning Accelerators: Opportunities and Challenges

Proceedings of the Great Lakes Symposium on VLSI 2022 | Pub Date: 2022-06-06 | DOI: 10.1145/3526241.3530051

K. Roy


Abstract

Traditional computing systems based on von Neumann architectures are fundamentally bottlenecked by the transfer speeds between memory and processor. With the growing computational needs of today's application space, dominated by Machine Learning (ML) workloads, there is a need to design special-purpose computing systems operating on the principle of co-located memory and processing units. Such an approach, commonly known as 'in-memory computing', can potentially eliminate expensive data movement costs by computing inside the memory array itself. To that effect, crossbars based on resistive switching Non-Volatile Memory (NVM) devices have shown immense promise as the building blocks of in-memory computing systems, as their high storage density can overcome the scaling challenges that plague CMOS technology today. In addition, the ability of resistive crossbars to accelerate the main computational kernel of ML workloads by performing massively parallel, in-situ matrix-vector multiplication (MVM) operations makes them a promising candidate for building area- and energy-efficient systems. However, the analog nature of computing in resistive crossbars introduces approximations in MVM computations due to device- and circuit-level non-idealities. Further, analog systems impose high-cost peripheral circuit requirements for conversions between the analog and digital domains. Thus, there is a need to understand the entire system design stack, from device characteristics to architectures, and to perform effective hardware-software co-design to truly realize the potential of resistive crossbars as future computing systems. In this talk, we will present a comprehensive overview of NVM crossbars for accelerating ML workloads. We describe, in detail, the design principles of the basic building blocks, such as the device and associated circuits, that constitute the crossbars. We explore non-idealities arising from the device characteristics and circuit behavior, and study their impact on the MVM functionality of NVM crossbars for machine learning hardware.
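
The in-situ MVM and the two error sources named in the abstract (device non-idealities and analog-to-digital conversion) can be illustrated with a small behavioral sketch. This is not the model used in the talk. It assumes, purely for illustration, that each weight is stored as a differential pair of conductances, that device variation is a multiplicative Gaussian perturbation, and that the ADC is a uniform quantizer. All function names and parameters (`crossbar_mvm`, `g_max`, `sigma`, `adc_bits`) are hypothetical.

```python
import random


def ideal_mvm(weights, inputs):
    """Reference digital MVM: out[j] = sum_i inputs[i] * weights[i][j]."""
    cols = len(weights[0])
    return [sum(inputs[i] * weights[i][j] for i in range(len(inputs)))
            for j in range(cols)]


def quantize(value, full_scale, bits):
    """Assumed ADC model: clip to [-full_scale, full_scale], then snap to
    one of 2**bits uniformly spaced levels."""
    levels = 2 ** bits - 1                      # number of steps between codes
    step = 2.0 * full_scale / levels
    clipped = max(-full_scale, min(full_scale, value))
    code = round((clipped + full_scale) / step)  # integer code in [0, levels]
    return code * step - full_scale


def crossbar_mvm(weights, inputs, g_max=1.0, sigma=0.0, adc_bits=None, rng=None):
    """Behavioral crossbar MVM (illustrative assumptions, not a device model).

    Inputs play the role of row voltages and weights are mapped to
    conductances. Each column current is the sum of voltage * conductance
    products (Ohm's law per device, Kirchhoff's current law per column).
    """
    rng = rng or random.Random(0)
    rows, cols = len(weights), len(weights[0])
    w_max = max(abs(w) for row in weights for w in row) or 1.0
    scale = g_max / w_max                        # weight -> conductance factor
    currents = []
    for j in range(cols):
        i_col = 0.0
        for i in range(rows):
            w = weights[i][j]
            # Differential pair: positive and negative parts on separate devices.
            g_pos = max(w, 0.0) * scale
            g_neg = max(-w, 0.0) * scale
            # Assumed non-ideality: multiplicative Gaussian device variation,
            # clamped so a conductance never becomes negative.
            g_pos = max(0.0, g_pos * (1.0 + rng.gauss(0.0, sigma)))
            g_neg = max(0.0, g_neg * (1.0 + rng.gauss(0.0, sigma)))
            i_col += inputs[i] * (g_pos - g_neg)
        currents.append(i_col)
    if adc_bits is not None:
        # Largest nominal column current magnitude sets the ADC range.
        full_scale = g_max * sum(abs(v) for v in inputs) or 1.0
        currents = [quantize(c, full_scale, adc_bits) for c in currents]
    # Convert currents back to the weight domain for comparison.
    return [c / scale for c in currents]


if __name__ == "__main__":
    W = [[0.5, -1.0], [0.25, 0.75]]
    x = [1.0, 2.0]
    print("ideal      :", ideal_mvm(W, x))
    print("variation  :", crossbar_mvm(W, x, sigma=0.05))
    print("4-bit ADC  :", crossbar_mvm(W, x, adc_bits=4))
```

With `sigma=0.0` and `adc_bits=None`, the sketch reduces to the ideal product. Raising `sigma` or lowering `adc_bits` shows how each assumed error source perturbs the result, which is the kind of trade-off that hardware-software co-design has to account for.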

