The Potential of Dynamic Binary Modification and CPU-FPGA SoCs for Simulation

John Mawer, Oscar Palomar, Cosmin Gorgovan, A. Nisbet, W. Toms, M. Luján
2017 IEEE 25th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), April 2017
DOI: 10.1109/FCCM.2017.36 (https://doi.org/10.1109/FCCM.2017.36)
Citations: 8

Abstract

In this paper we describe a flexible infrastructure that directly interfaces unmodified application executables with FPGA hardware acceleration IP in order to (1) facilitate faster computer architecture simulation and (2) prototype microarchitecture or accelerator IP. Dynamic binary modification tool plugins are interfaced directly to the application under evaluation through flexible software interfaces provided by a userspace hardware control library, which also manages access to a parameterised Bluespec IP library. We demonstrate the potential of our infrastructure with two use cases on unmodified application executables: (1) an executable is dynamically instrumented to generate load/store and program counter events that are sent to FPGA hardware-accelerated models of an in-order microarchitecture pipeline and a memory hierarchy, and (2) the design of a branch predictor is prototyped on an FPGA. The key features of our infrastructure are the ability to instrument at instruction-level granularity, to code exclusively at the user level, and to dynamically discover and use available hardware models at run time. We thus enable software developers to rapidly investigate and evaluate parameterised Bluespec microarchitecture and accelerator IP models. We present a comparison between our system and GEM5, the industry-standard ARM architecture simulator, to demonstrate accuracy and relative performance. Even though our system is implemented on a Xilinx Zynq 7000 FPGA board with tightly coupled FPGA and ARM Cortex A9 processors, it outperforms GEM5 running on a Xeon with 32 GB of RAM (400x vs 700x slowdown over native execution).
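
To make the first use case more concrete, the following is a minimal software-only sketch of the event flow, not the authors' implementation. It assumes that instrumentation yields one event per load or store, carrying the program counter and the effective address, and that a model consumes those events in order. In the paper the models are Bluespec IP running on the FPGA; here a toy direct-mapped cache in Python stands in for one. The names `MemEvent`, `DirectMappedCache`, and `simulate`, as well as the cache geometry, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class MemEvent:
    """One instrumented memory access (hypothetical event layout)."""
    pc: int         # program counter of the instrumented instruction
    is_store: bool  # True for a store, False for a load
    addr: int       # effective memory address


class DirectMappedCache:
    """Toy direct-mapped cache; a software stand-in for a hardware model."""

    def __init__(self, num_lines: int = 64, line_bytes: int = 32) -> None:
        # Assumed geometry: 64 lines of 32 bytes (2 KiB), not taken from the paper.
        self.num_lines = num_lines
        self.line_bytes = line_bytes
        self.tags = [None] * num_lines  # one tag slot per cache line
        self.hits = 0
        self.misses = 0

    def access(self, event: MemEvent) -> bool:
        """Record one access and return True on a hit, False on a miss."""
        block = event.addr // self.line_bytes
        index = block % self.num_lines
        tag = block // self.num_lines
        if self.tags[index] == tag:
            self.hits += 1
            return True
        # Miss: allocate the line for both loads and stores.
        self.tags[index] = tag
        self.misses += 1
        return False


def simulate(events: Iterable[MemEvent], cache: DirectMappedCache) -> Tuple[int, int]:
    """Feed events to the model in program order; return (hits, misses)."""
    for event in events:
        cache.access(event)
    return cache.hits, cache.misses


if __name__ == "__main__":
    trace = [
        MemEvent(pc=0x1000, is_store=False, addr=0x2000),  # cold miss
        MemEvent(pc=0x1004, is_store=False, addr=0x2004),  # same line: hit
        MemEvent(pc=0x1008, is_store=True, addr=0x2800),   # same index, new tag: miss
        MemEvent(pc=0x100C, is_store=False, addr=0x2000),  # line was evicted: miss
    ]
    print(simulate(trace, DirectMappedCache()))  # prints (1, 3)
```

In this sketch the event producer and the model share only the event record. That loosely mirrors the separation the abstract describes between the instrumentation plugins and the hardware models, so a different model could consume the same trace without changing how events are generated.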