GAROS: Genetic algorithm-aided row-skipping for shift and duplicate kernel mapping in processing-in-memory architectures

Impact factor 3.7; Zone 2 (Computer Science); JCR Q1 (Computer Science, Hardware & Architecture)
Johnny Rhe, Kang Eun Jeon, Jong Hwan Ko
Journal of Systems Architecture, Volume 165, Article 103423. Published 2025-05-12.
DOI: 10.1016/j.sysarc.2025.103423
https://www.sciencedirect.com/science/article/pii/S1383762125000955
Citations: 0

Abstract

Processing-in-memory (PIM) architecture is becoming a promising candidate for convolutional neural network (CNN) inference. A recent mapping method, shift and duplicate kernel (SDK), reduces latency by improving array utilization: it shifts copies of the same kernels into otherwise idle columns. Although pattern-based pruning effectively enables row-skipping, traditional pattern designs are suboptimal for SDK mapping because the irregular kernel shifts complicate row-skipping. To address this, we previously proposed pruning-aided row-skipping (PAIRS), which adopts SDK-optimized layer-wise patterns. However, PAIRS has two key limitations. First, it uses a single pattern set and therefore offers only discrete levels of row-skipping, which restricts precise control over weight matrix compression for varying layer and array sizes. Second, it risks accuracy loss by pruning critical weights. To overcome these challenges, we introduce genetic algorithm-aided row-skipping (GAROS), which employs input channel (IC)-wise patterns. GAROS enables finer control over row-skipping by assigning several pattern sets and selecting the optimal pattern for each IC so that critical weights are preserved. Consequently, this approach enables continuous weight matrix compression while balancing the trade-off between row-skipping and accuracy. Simulation results on WRN16-4 demonstrate that GAROS improves accuracy by up to 2.4% compared to PAIRS and achieves up to a 1.74× speedup over the baseline when a 128 × 128 sub-array is used.
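The abstract does not give the details of the genetic algorithm, so the following is only a toy sketch of the general idea of choosing one pruning pattern per input channel under a row budget. Everything in it is an assumption made for illustration: the four 3 × 3 patterns, the `importance` scores (standing in for per-position weight magnitude summed over output channels), the penalty-based fitness, and the GA operators. It also models a conventional im2col-style mapping, where each (input channel, kernel position) pair occupies one row, rather than the SDK mapping the paper targets.

```python
import random

# Hypothetical pattern set for 3x3 kernels. Each pattern lists the kernel
# positions (0..8) that are KEPT; pruned positions become all-zero rows
# that can be skipped.
PATTERNS = [
    (0, 1, 2, 3, 4, 5, 6, 7, 8),  # keep everything
    (1, 3, 4, 5, 7),              # keep a cross
    (3, 4, 5),                    # keep the middle row
    (4,),                         # keep only the center
]

def kept_rows(chromosome):
    """Rows remaining: one per kept kernel position per input channel."""
    return sum(len(PATTERNS[g]) for g in chromosome)

def preserved_magnitude(chromosome, importance):
    """Total importance of the weights that survive pruning."""
    return sum(importance[ic][p]
               for ic, g in enumerate(chromosome) for p in PATTERNS[g])

def fitness(chromosome, importance, row_budget):
    """Preserved importance, heavily penalized when the row budget is exceeded."""
    over = max(0, kept_rows(chromosome) - row_budget)
    return preserved_magnitude(chromosome, importance) - 1e6 * over

def evolve(importance, row_budget, pop_size=40, generations=60,
           mutation_rate=0.1, seed=0):
    """Pick one pattern index per input channel (needs at least 2 channels)."""
    rng = random.Random(seed)
    n_ic = len(importance)

    def score(c):
        return fitness(c, importance, row_budget)

    # Seed the population with the most aggressive choice so that a feasible
    # individual exists whenever the budget allows one.
    smallest = min(range(len(PATTERNS)), key=lambda g: len(PATTERNS[g]))
    pop = [[smallest] * n_ic]
    pop += [[rng.randrange(len(PATTERNS)) for _ in range(n_ic)]
            for _ in range(pop_size - 1)]

    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        nxt = [pop[0][:]]  # elitism: carry the best individual over unchanged
        while len(nxt) < pop_size:
            a = max(rng.sample(pop, 3), key=score)  # tournament selection
            b = max(rng.sample(pop, 3), key=score)
            cut = rng.randrange(1, n_ic)            # one-point crossover
            child = a[:cut] + b[cut:]
            child = [rng.randrange(len(PATTERNS))
                     if rng.random() < mutation_rate else g
                     for g in child]
            nxt.append(child)
        pop = nxt
    return max(pop, key=score)

# Usage with made-up importance scores: 8 input channels, 9 positions each.
data_rng = random.Random(1)
importance = [[data_rng.random() for _ in range(9)] for _ in range(8)]
best = evolve(importance, row_budget=36)
print(kept_rows(best), "of", 9 * 8, "rows kept; patterns per IC:", best)
```

Because each channel can take a different pattern, the number of kept rows can land on many values between the extremes, which is the sense in which per-channel selection gives finer-grained compression than one pattern set for a whole layer.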
Source journal
Journal of Systems Architecture (Engineering and Technology: Computer Science, Hardware)
CiteScore: 8.70
Self-citation rate: 15.60%
Articles published: 226
Review time: 46 days
Journal description: The Journal of Systems Architecture: Embedded Software Design (JSA) is a journal covering all design and architectural aspects related to embedded systems and software. It ranges from the microarchitecture level via the system software level up to the application-specific architecture level. Aspects such as real-time systems, operating systems, FPGA programming, programming languages, communications (limited to analysis and the software stack), mobile systems, parallel and distributed architectures, as well as additional subjects in the computer and system architecture area, fall within the scope of this journal. Technology will not be a main focus, but its use and relevance to particular designs will be. Case studies are welcome but must contribute more than just a design for a particular piece of software. Design automation of such systems, including methodologies, techniques and tools for their design, as well as novel designs of software components, falls within the scope of this journal. Novel applications that use embedded systems are also central to this journal. While hardware is not a part of this journal, hardware/software co-design methods that consider the interplay between software and hardware components, with an emphasis on software, are also relevant here.