Efficient Design Space Exploration for the BOOM Using SAC-Based Reinforcement Learning

IF 3.1 · Zone 2 (Engineering & Technology) · JCR Q2 · COMPUTER SCIENCE, HARDWARE & ARCHITECTURE
Mingjun Cheng;Shihan Zhang;Xin Zheng;Xian Lin;Huaien Gao;Shuting Cai;Xiaoming Xiong;Bei Yu
Journal: IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 33, no. 8, pp. 2252-2263
Published: 2025-06-17
DOI: 10.1109/TVLSI.2025.3572799
URL: https://ieeexplore.ieee.org/document/11038948/
Citation count: 0

Abstract

Design space exploration (DSE) is crucial for optimizing the performance, power, and area (PPA) of CPU microarchitectures ($\mu$-archs). While various machine learning (ML) algorithms have been applied to the $\mu$-arch DSE problem, the potential of reinforcement learning (RL) remains underexplored. In this article, we propose a novel RL-based approach to address the reduced instruction set computer V (RISC-V) CPU $\mu$-arch DSE problem. This approach enables dynamic selection and optimization of $\mu$-arch parameters without relying on predefined modification sequences, thus significantly enhancing exploration flexibility. To address the challenges posed by high-dimensional action spaces and sparse rewards, we use a discrete soft actor-critic (SAC) framework with entropy maximization to promote efficient exploration. In addition, we integrate multistep temporal-difference (TD) learning, an experience replay (ER) buffer, and return normalization to improve sample efficiency and learning stability during training. Our method further aligns optimization with user-defined preferences by normalizing PPA metrics relative to baseline designs. Experimental results on the Berkeley out-of-order machine (BOOM) demonstrate that the proposed approach achieves superior performance compared with state-of-the-art methods, showcasing its effectiveness and efficiency for $\mu$-arch DSE. Our code is available at https://github.com/exhaust-create/SAC-DSE.
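The abstract names three ingredients that a small sketch can make concrete: PPA metrics normalized against a baseline and weighted by user preferences, a multistep TD target, and the entropy-regularized state value used by discrete SAC. The Python below is only an illustration under stated assumptions. The function names, the dictionary keys, and the weighted-sum form of the reward are hypothetical and are not taken from the paper or its repository; the multistep target and the soft value follow the standard textbook definitions.

```python
import math


def ppa_reward(perf, power, area, baseline, weights):
    """Preference-weighted PPA score relative to a baseline design.

    Assumption: higher performance is better, lower power and area are
    better, so the power and area ratios are inverted. The weighted sum
    is an illustrative choice, not the paper's reward.
    """
    ratios = (
        perf / baseline["perf"],
        baseline["power"] / power,
        baseline["area"] / area,
    )
    return sum(w * r for w, r in zip(weights, ratios))


def n_step_return(rewards, bootstrap_value, gamma):
    """Multistep TD target: sum_k gamma^k * r_k + gamma^n * V(s_n)."""
    target = bootstrap_value
    for r in reversed(rewards):
        target = r + gamma * target
    return target


def soft_state_value(q_values, probs, alpha):
    """Discrete-SAC soft value: E_pi[Q(s, a) - alpha * log pi(a | s)]."""
    return sum(
        p * (q - alpha * math.log(p))
        for q, p in zip(q_values, probs)
        if p > 0.0
    )


# Usage with made-up numbers (not results from the paper).
baseline = {"perf": 1.0, "power": 1.0, "area": 1.0}
score = ppa_reward(1.2, 0.5, 2.0, baseline, weights=(1.0, 1.0, 1.0))
target = n_step_return([1.0, 1.0, 1.0], bootstrap_value=10.0, gamma=0.5)
value = soft_state_value([1.0, 1.0], [0.5, 0.5], alpha=1.0)
```

With weights that sum to one, the baseline design itself scores exactly 1 under this sketch, so any score above 1 reads as a preference-weighted improvement. The entropy term in `soft_state_value` raises the value of states where the policy stays spread over several actions, which is the mechanism that encourages exploration.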
Source journal metrics
CiteScore: 6.40
Self-citation rate: 7.10%
Articles published: 187
Review time: 3.6 months
Journal description: The IEEE Transactions on VLSI Systems is published as a monthly journal under the co-sponsorship of the IEEE Circuits and Systems Society, the IEEE Computer Society, and the IEEE Solid-State Circuits Society. Design and realization of microelectronic systems using VLSI/ULSI technologies require close collaboration among scientists and engineers in the fields of systems architecture, logic and circuit design, chip and wafer fabrication, packaging, testing, and systems applications. Generation of specifications, design, and verification must be performed at all abstraction levels, including the system, register-transfer, logic, circuit, transistor, and process levels. The IEEE Transactions on VLSI Systems was founded to address this critical area through a common forum. The editorial board, consisting of international experts, invites original papers that emphasize the novel systems integration aspects of microelectronic systems, including interactions among systems design and partitioning, logic and memory design, digital and analog circuit design, layout synthesis, CAD tools, chip and wafer fabrication, testing and packaging, and systems-level qualification. Thus, the coverage of these Transactions will focus on VLSI/ULSI microelectronic systems integration.