Efficient Design Space Exploration for the BOOM Using SAC-Based Reinforcement Learning

IF 3.1 · CAS Zone 2 (Engineering & Technology) · JCR Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

IEEE Transactions on Very Large Scale Integration (VLSI) Systems · Pub Date: 2025-06-17 · DOI: 10.1109/TVLSI.2025.3572799

Mingjun Cheng;Shihan Zhang;Xin Zheng;Xian Lin;Huaien Gao;Shuting Cai;Xiaoming Xiong;Bei Yu

Volume 33, Issue 8, pp. 2252-2263 · IEEE Xplore: https://ieeexplore.ieee.org/document/11038948/

Citations: 0

Abstract

Design space exploration (DSE) is crucial for optimizing the performance, power, and area (PPA) of CPU microarchitectures ($\mu$-archs). While various machine learning (ML) algorithms have been applied to the $\mu$-arch DSE problem, the potential of reinforcement learning (RL) remains underexplored. In this article, we propose a novel RL-based approach to address the reduced instruction set computer V (RISC-V) CPU $\mu$-arch DSE problem. This approach enables dynamic selection and optimization of $\mu$-arch parameters without relying on predefined modification sequences, thus significantly enhancing exploration flexibility. To address the challenges posed by high-dimensional action spaces and sparse rewards, we use a discrete soft actor-critic (SAC) framework with entropy maximization to promote efficient exploration. In addition, we integrate multistep temporal-difference (TD) learning, an experience replay (ER) buffer, and return normalization to improve sample efficiency and learning stability during training. Our method further aligns optimization with user-defined preferences by normalizing PPA metrics relative to baseline designs. Experimental results on the Berkeley out-of-order machine (BOOM) demonstrate that the proposed approach achieves superior performance compared with state-of-the-art methods, showcasing its effectiveness and efficiency for $\mu$-arch DSE. Our code is available at https://github.com/exhaust-create/SAC-DSE.
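The abstract names several ingredients without giving formulas: PPA metrics normalized against a baseline design, a multistep TD target, and the entropy-regularized value used in discrete SAC. The Python sketch below illustrates the standard textbook form of each. It is not taken from the authors' repository; the function names, the dictionary keys, the weight tuple, and the specific reward shape (a weighted sum of relative improvements) are all assumptions made for illustration.

```python
import math
from typing import Mapping, Sequence, Tuple


def ppa_reward(perf: float, power: float, area: float,
               baseline: Mapping[str, float],
               weights: Tuple[float, float, float] = (1.0, 1.0, 1.0)) -> float:
    """Hypothetical scalar reward: weighted relative improvement over a baseline.

    Assumes higher performance is better and lower power/area are better.
    The baseline design itself scores exactly 0.
    """
    w_perf, w_power, w_area = weights  # user-defined preferences (assumed form)
    return (w_perf * (perf / baseline["perf"] - 1.0)
            + w_power * (1.0 - power / baseline["power"])
            + w_area * (1.0 - area / baseline["area"]))


def soft_state_value(q_values: Sequence[float], probs: Sequence[float],
                     alpha: float) -> float:
    """Discrete-action soft value: V(s) = sum_a pi(a|s) * (Q(s,a) - alpha * log pi(a|s))."""
    return sum(p * (q - alpha * math.log(p))
               for q, p in zip(q_values, probs) if p > 0.0)


def n_step_target(rewards: Sequence[float], bootstrap_value: float,
                  gamma: float) -> float:
    """Multistep TD target: r_0 + gamma*r_1 + ... + gamma^n * V(s_n)."""
    target = bootstrap_value
    for r in reversed(rewards):  # fold from the last step back to the first
        target = r + gamma * target
    return target


if __name__ == "__main__":
    base = {"perf": 1.0, "power": 1.0, "area": 1.0}
    # A design that is 20% faster, uses 10% less power, and has the same area.
    print(ppa_reward(1.2, 0.9, 1.0, base))
    # Bootstrap a 2-step target from the soft value of the state reached.
    v_next = soft_state_value([1.0, 3.0], [0.5, 0.5], alpha=0.2)
    print(n_step_target([0.0, 0.3], v_next, gamma=0.99))
```

Setting `alpha` to 0 in `soft_state_value` recovers the ordinary expected Q-value, which makes the role of the entropy term easy to see: a larger `alpha` rewards policies that keep more actions in play, which is the exploration pressure the abstract refers to.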


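Source journal

IEEE Transactions on Very Large Scale Integration (VLSI) Systems (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 6.40

Self-citation rate: 7.10%

Articles published: 187

Review time: 3.6 months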

Journal description: The IEEE Transactions on VLSI Systems is published as a monthly journal under the co-sponsorship of the IEEE Circuits and Systems Society, the IEEE Computer Society, and the IEEE Solid-State Circuits Society. Design and realization of microelectronic systems using VLSI/ULSI technologies require close collaboration among scientists and engineers in the fields of systems architecture, logic and circuit design, chip and wafer fabrication, packaging, testing, and systems applications. Generation of specifications, design, and verification must be performed at all abstraction levels, including the system, register-transfer, logic, circuit, transistor, and process levels. The IEEE Transactions on VLSI Systems was founded to address this critical area through a common forum. The editorial board, consisting of international experts, invites original papers that emphasize the novel systems integration aspects of microelectronic systems, including interactions among systems design and partitioning, logic and memory design, digital and analog circuit design, layout synthesis, CAD tools, chip and wafer fabrication, testing and packaging, and systems-level qualification. Thus, the coverage of these Transactions focuses on VLSI/ULSI microelectronic systems integration.