Flexible Generation of Fast and Accurate Software Performance Simulators From Compact Processor Descriptions

IF 2.7 3区 计算机科学 Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE
Conrad Foik;Robert Kunzelmann;Daniel Mueller-Gritschneder;Ulf Schlichtmann
{"title":"Flexible Generation of Fast and Accurate Software Performance Simulators From Compact Processor Descriptions","authors":"Conrad Foik;Robert Kunzelmann;Daniel Mueller-Gritschneder;Ulf Schlichtmann","doi":"10.1109/TCAD.2024.3445255","DOIUrl":null,"url":null,"abstract":"To find optimal solutions for modern embedded systems, designers frequently rely on the software performance simulators. These simulators combine an abstract functional description of a processor with a nonfunctional timing model to accurately estimate the processor’s timing while maintaining high simulation speeds. However, current performance simulators either inflexibly target specific processors or sacrifice accuracy or simulation speed. This article presents a new approach to the software performance simulation, combining flexibility with highly accurate estimates and high simulation speed. A code generator converts a compact structural description of the target processor’s pipeline into sets of timing constraints, describing the processor’s instruction execution. Based on these, it generates corresponding scheduling functions and timing variables, representing the availability of the modeled pipeline. The performance estimator uses these components to approximate the processor’s timing based on an instruction trace provided by an instruction set simulator. Results for the state-of-the-art CV32E40P and CVA6 RISC-V processors show an average relative error of 0.0015% and 3.88%, respectively, over a large set of benchmarks. Our approach reaches an average simulation speed of 24 and 15 million instructions per second (MIPS), respectively.","PeriodicalId":13251,"journal":{"name":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems","volume":"43 11","pages":"4130-4141"},"PeriodicalIF":2.7000,"publicationDate":"2024-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10745761","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10745761/","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
引用次数: 0

Abstract

To find optimal solutions for modern embedded systems, designers frequently rely on the software performance simulators. These simulators combine an abstract functional description of a processor with a nonfunctional timing model to accurately estimate the processor’s timing while maintaining high simulation speeds. However, current performance simulators either inflexibly target specific processors or sacrifice accuracy or simulation speed. This article presents a new approach to the software performance simulation, combining flexibility with highly accurate estimates and high simulation speed. A code generator converts a compact structural description of the target processor’s pipeline into sets of timing constraints, describing the processor’s instruction execution. Based on these, it generates corresponding scheduling functions and timing variables, representing the availability of the modeled pipeline. The performance estimator uses these components to approximate the processor’s timing based on an instruction trace provided by an instruction set simulator. Results for the state-of-the-art CV32E40P and CVA6 RISC-V processors show an average relative error of 0.0015% and 3.88%, respectively, over a large set of benchmarks. Our approach reaches an average simulation speed of 24 and 15 million instructions per second (MIPS), respectively.
从紧凑型处理器描述灵活生成快速准确的软件性能模拟器
为了找到现代嵌入式系统的最佳解决方案,设计人员经常依赖软件性能模拟器。这些模拟器将处理器的抽象功能描述与非功能时序模型相结合,在保持较高模拟速度的同时,准确估计处理器的时序。然而,目前的性能模拟器要么缺乏针对特定处理器的灵活性,要么牺牲了精度或模拟速度。本文提出了一种新的软件性能仿真方法,将灵活性与高精度估算和高速仿真结合起来。代码生成器可将目标处理器流水线的紧凑结构描述转换为描述处理器指令执行的时序约束集。在此基础上,生成相应的调度函数和时序变量,代表建模流水线的可用性。性能估算器根据指令集模拟器提供的指令轨迹,利用这些组件对处理器的时序进行近似估算。在大量基准测试中,最先进的 CV32E40P 和 CVA6 RISC-V 处理器的结果显示平均相对误差分别为 0.0015% 和 3.88%。我们的方法的平均仿真速度分别达到每秒 2400 万条和 1500 万条指令 (MIPS)。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
CiteScore
5.60
自引率
13.80%
发文量
500
审稿时长
7 months
期刊介绍: The purpose of this Transactions is to publish papers of interest to individuals in the area of computer-aided design of integrated circuits and systems composed of analog, digital, mixed-signal, optical, or microwave components. The aids include methods, models, algorithms, and man-machine interfaces for system-level, physical and logical design including: planning, synthesis, partitioning, modeling, simulation, layout, verification, testing, hardware-software co-design and documentation of integrated circuit and system designs of all complexities. Design tools and techniques for evaluating and designing integrated circuits and systems for metrics such as performance, power, reliability, testability, and security are a focus.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信