Big-Computing and Little-Storing STT-MRAM PIM Architecture With Charge Domain Based MAC Operation
Authors: Yunho Jang; Dongsu Kim; Yeseul Kim; Jongsun Park
Journal: IEEE Transactions on Computers, vol. 74, no. 4, pp. 1239-1252 (impact factor 3.6; JCR Q2, Computer Science, Hardware & Architecture)
Published: 2024-12-16
DOI: 10.1109/TC.2024.3517754
URL: https://ieeexplore.ieee.org/document/10803072/
Citations: 0
Abstract
Spin transfer torque magnetic random access memory (STT-MRAM) is a promising memory technology for processing in memory (PIM) thanks to its high endurance and relatively low device-to-device and cycle-to-cycle variations. However, the low OFF/ON ratio of the STT device limits the number of active row-lines during multiply-accumulate (MAC) operations, degrading energy efficiency and computation speed. In this paper, we present an energy-efficient and high-speed Big-computing and Little-storing STT-MRAM PIM (BCLS-SP) architecture, which can increase the number of active row-lines with almost no area overhead. In the BCLS-SP architecture, a charge domain-based STT-MRAM PIM (CD-SP) structure improves MAC operation reliability, allowing many row-lines to be activated concurrently. Filter-wise weight compression (FWC) and weight sharing (WS) are also devised to compress the weights stored in CD-SP, thus reducing area cost. In addition, the proposed architecture performs MAC operations with skipping of zero-valued inputs (SZI) and a zero-conversion scheme (ZCS) for better energy efficiency and performance. Simulations using a 28 nm CMOS process show that the BCLS-SP architecture achieves an energy reduction of 29% and a performance improvement of 3.6× compared to a recent memristive device-based PIM that uses weight compression and input skipping.
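To make the interplay between the active row-line limit and zero-input skipping concrete, here is a minimal functional sketch in Python. It is not the authors' circuit or algorithm: the function name `mac_with_zero_skipping`, the `max_active_rows` parameter, and the grouping of rows into array cycles are all assumptions made for illustration. It only models the arithmetic idea that rows with a zero input contribute nothing to the sum and so need not be activated.

```python
from typing import List, Tuple


def mac_with_zero_skipping(
    inputs: List[int], weights: List[int], max_active_rows: int
) -> Tuple[int, int]:
    """Hypothetical functional model of a MAC over one weight column.

    Rows are accumulated in groups of at most `max_active_rows`
    (an assumed stand-in for the active row-line limit). Rows whose
    input is zero are dropped before grouping.
    Returns (mac_result, number_of_array_cycles).
    """
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    if max_active_rows < 1:
        raise ValueError("max_active_rows must be at least 1")

    # Keep only rows with a non-zero input; the others add nothing to the sum.
    active = [(x, w) for x, w in zip(inputs, weights) if x != 0]

    total = 0
    cycles = 0
    # Each loop iteration models one array access activating one group of rows.
    for start in range(0, len(active), max_active_rows):
        group = active[start:start + max_active_rows]
        total += sum(x * w for x, w in group)
        cycles += 1
    return total, cycles


if __name__ == "__main__":
    x = [0, 3, 0, 0, 2, 1, 0, 4]
    w = [5, -1, 2, 7, 3, -2, 6, 1]
    print(mac_with_zero_skipping(x, w, max_active_rows=2))
```

In this toy example only four of the eight inputs are non-zero, so with two rows per access the model needs two accesses instead of four. Raising `max_active_rows` reduces the count further, which is the kind of benefit the abstract attributes to activating more row-lines at once.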
About the journal:
The IEEE Transactions on Computers is a monthly publication with a wide distribution to researchers, developers, technical managers, and educators in the computer field. It publishes papers on research in areas of current interest to the readers. These areas include, but are not limited to, the following: a) computer organizations and architectures; b) operating systems, software systems, and communication protocols; c) real-time systems and embedded systems; d) digital devices, computer components, and interconnection networks; e) specification, design, prototyping, and testing methods and tools; f) performance, fault tolerance, reliability, security, and testability; g) case studies and experimental and theoretical evaluations; and h) new and important applications and trends.