Hardware-accelerated cache simulation for multicore by FPGA

Shih-Hao Hung, Yi-Mo Ho, C. Yeh, C. Liu, Chen-Pang Lee
Published in: Proceedings of the 2018 Conference on Research in Adaptive and Convergent Systems. Publication date: 2018-10-09. DOI: 10.1145/3264746.3264766 (https://doi.org/10.1145/3264746.3264766)
Citations: 1

Abstract

Hardware-accelerated cache simulation for multicore by FPGA
Developers often use a virtual platform to develop software before the hardware is available. For software optimization, it is important to profile an application's cache misses in a realistic operating environment on the virtual platform. In the multicore era, however, coherence cache misses are hard to simulate at high speed. In this paper, we propose a hardware-accelerated architecture that simulates the cache misses of a multicore system, and we implement the cache miss simulator on top of a virtual platform with an FPGA. Users can profile their software as if it were running on the multicore system. The evaluation shows a throughput of 65 MB of trace log per second with the FPGA running at 100 MHz; about 570,000 logic elements are used to simulate four L1 caches and one L2 cache for a multicore system with 4 virtual CPUs. The system achieves a 1.6x to 2x speedup over the popular cache miss simulator Dinero IV. Dinero IV does less work and does not support coherence cache misses in a multicore system. The results show that a hardware-accelerated, FPGA-based architecture offers a clear advantage in speeding up cache miss simulation for multicore systems.
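To make the idea of counting coherence cache misses from a trace log more concrete, here is a minimal software sketch of a trace-driven simulator with four private L1 caches and one L2 cache. It is not the authors' FPGA design. The direct-mapped organization, the 64-byte line size, the write-invalidate policy, the assumption that the L2 is shared by all cores, the `(cpu, op, addr)` trace format, and every name in the code are illustrative assumptions that the abstract does not specify.

```python
from dataclasses import dataclass, field

LINE_BYTES = 64  # assumed cache line size (not given in the abstract)


@dataclass
class DirectMappedCache:
    """Tag store only: records which line occupies each set, no data."""
    num_sets: int
    tags: dict = field(default_factory=dict)        # set index -> line address
    invalidated: set = field(default_factory=set)   # lines removed by a remote write

    def lookup(self, line: int) -> bool:
        return self.tags.get(line % self.num_sets) == line

    def fill(self, line: int) -> None:
        self.tags[line % self.num_sets] = line
        self.invalidated.discard(line)

    def invalidate(self, line: int) -> None:
        if self.lookup(line):
            del self.tags[line % self.num_sets]
            self.invalidated.add(line)


def simulate(trace, num_cpus=4, l1_sets=64, l2_sets=1024):
    """Replay (cpu, op, addr) records and count misses.

    Hypothetical model: private L1 per CPU, one shared L2, write-allocate,
    and write-invalidate coherence between the L1 caches.
    """
    l1 = [DirectMappedCache(l1_sets) for _ in range(num_cpus)]
    l2 = DirectMappedCache(l2_sets)
    stats = {"accesses": 0, "l1_misses": 0, "coherence_misses": 0, "l2_misses": 0}

    for cpu, op, addr in trace:
        line = addr // LINE_BYTES
        stats["accesses"] += 1

        if not l1[cpu].lookup(line):
            stats["l1_misses"] += 1
            # Simplification: any miss on a line that a remote write removed
            # earlier is classified as a coherence miss.
            if line in l1[cpu].invalidated:
                stats["coherence_misses"] += 1
            if not l2.lookup(line):
                stats["l2_misses"] += 1
                l2.fill(line)
            l1[cpu].fill(line)

        if op == "W":
            # Write-invalidate: drop the line from every other core's L1.
            for other in range(num_cpus):
                if other != cpu:
                    l1[other].invalidate(line)

    return stats


if __name__ == "__main__":
    example_trace = [
        (0, "R", 0x1000),  # CPU 0 cold miss (L1 and L2)
        (1, "R", 0x1000),  # CPU 1 L1 miss, L2 hit
        (0, "W", 0x1000),  # CPU 0 hit, invalidates CPU 1's copy
        (1, "R", 0x1000),  # CPU 1 coherence miss
        (0, "R", 0x1000),  # CPU 0 hit
    ]
    print(simulate(example_trace))
```

In this sketch every trace record is handled sequentially, and each write has to visit all other L1 caches. That per-record work is the kind of cost that grows with the number of cores in a software simulator, which is the motivation the abstract gives for moving the simulation into hardware.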