In-Memory Computing Accelerator for Iterative Linear Algebra Solvers

IF 1.4; Zone 3, Computer Science; JCR Q4, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE
Rui Liu;Zerun Li;Xiaoyu Zhang;Xiaoming Chen;Yinhe Han;Minghua Tang
DOI: 10.1109/LCA.2025.3563365
Journal: IEEE Computer Architecture Letters, vol. 24, no. 1, pp. 161-164
Published: 2025-04-22
Publication type: Journal Article
Open access: No
URL: https://ieeexplore.ieee.org/document/10972329/
Citations: 0

Abstract

Iterative linear solvers are a crucial kernel in many numerical analysis problems. On traditional architectures, the performance and energy efficiency of iterative solvers are severely constrained by the memory wall bottleneck. Computing-in-memory (CIM) has the potential to enhance solving efficiency. However, existing CIM architectures are mostly customized for specific algorithms and primarily handle fixed-point operations, which makes it difficult for them to meet the demands of diverse, high-precision applications. In this work, we propose a CIM architecture that natively supports various iterative linear solvers based on floating-point operations. We develop a new instruction set for the accelerator, whose instructions can be flexibly combined to implement various iterative solvers. The evaluation results show that, compared with the GPU implementation, our accelerator achieves more than 10.1× speedup and 6.8× energy savings when executing different iterative solvers.
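As a rough illustration of the kind of kernel the abstract refers to, the sketch below implements the conjugate gradient (CG) method in plain Python. CG is chosen here only as a familiar example of an iterative linear solver; the abstract does not name the solvers the accelerator supports, and the helper names (`matvec`, `dot`, `axpy`, `conjugate_gradient`) are hypothetical and unrelated to the paper's instruction set. The point is that each iteration is built from a few primitives (one matrix-vector product, two dot products, three vector updates), and the matrix-vector product streams the whole matrix from memory while doing little arithmetic per element.

```python
# Illustrative sketch only: CG stands in for "an iterative linear solver".
# Nothing here reproduces the paper's architecture or instruction set.

def matvec(A, x):
    """Dense matrix-vector product y = A x (reads every matrix entry once)."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

def dot(u, v):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))

def axpy(alpha, x, y):
    """Return the vector alpha * x + y."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]

def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive-definite matrix A."""
    x = [0.0] * len(b)
    r = list(b)            # residual b - A x, with the initial guess x = 0
    p = list(r)            # first search direction
    rs_old = dot(r, r)
    for iteration in range(max_iter):
        if rs_old ** 0.5 < tol:          # stop once the residual norm is small
            return x, iteration
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)      # step length along p
        x = axpy(alpha, p, x)            # x <- x + alpha * p
        r = axpy(-alpha, Ap, r)          # r <- r - alpha * A p
        rs_new = dot(r, r)
        p = axpy(rs_new / rs_old, p, r)  # p <- r + beta * p
        rs_old = rs_new
    return x, max_iter

# Small symmetric positive-definite system; the exact solution is (1/11, 7/11).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x, iterations = conjugate_gradient(A, b)
```

Other iterative methods reuse the same primitives in different combinations, which is one way to read the abstract's remark that instructions can be flexibly combined to implement various solvers.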
Source journal
IEEE Computer Architecture Letters (COMPUTER SCIENCE, HARDWARE & ARCHITECTURE)
CiteScore: 4.60
Self-citation rate: 4.30%
Articles published: 29
About the journal: IEEE Computer Architecture Letters is a rigorously peer-reviewed forum for publishing early, high-impact results in the areas of uni- and multiprocessor computer systems, computer architecture, microarchitecture, workload characterization, performance evaluation and simulation techniques, and power-aware computing. Submissions are welcomed on any topic in computer architecture, especially but not limited to: microprocessor and multiprocessor systems, microarchitecture and ILP processors, workload characterization, performance evaluation and simulation techniques, compiler-hardware and operating system-hardware interactions, interconnect architectures, memory and cache systems, power and thermal issues at the architecture level, I/O architectures and techniques, independent validation of previously published results, analysis of unsuccessful techniques, domain-specific processor architectures (e.g., embedded, graphics, network, etc.), real-time and high-availability architectures, reconfigurable systems.