加速器丰富的cmp:从概念到实际硬件

2013 IEEE 31st International Conference on Computer Design (ICCD) Pub Date : 2013-11-07 DOI:10.1109/ICCD.2013.6657039

Yu-Ting Chen, J. Cong, M. Ghodrat, Muhuan Huang, Chunyue Liu, Bingjun Xiao, Yi Zou

{"title":"加速器丰富的cmp:从概念到实际硬件","authors":"Yu-Ting Chen, J. Cong, M. Ghodrat, Muhuan Huang, Chunyue Liu, Bingjun Xiao, Yi Zou","doi":"10.1109/ICCD.2013.6657039","DOIUrl":null,"url":null,"abstract":"Application-specific accelerators provide 10-100× improvement in power efficiency over general-purpose processors. The accelerator-rich architectures are especially promising. This work discusses a prototype of accelerator-rich CMPs (PARC). During our development of PARC in real hardware, we encountered a set of technical challenges and proposed corresponding solutions. First, we provided system IPs that serve a sea of accelerators to transfer data between userspace and accelerator memories without cache overhead. Second, we designed a dedicated interconnect between accelerators and memories to enable memory sharing. Third, we implemented an accelerator manager to virtualize accelerator resources for users. Finally, we developed an automated flow with a number of IP templates and customizable interfaces to a C-based synthesis flow to enable rapid design and update of PARC. We implemented PARC in a Virtex-6 FPGA chip with integration of platform-specific peripherals and booting of unmodified Linux. Experimental results show that PARC can fully exploit the energy benefits of accelerators at little system overhead.","PeriodicalId":398811,"journal":{"name":"2013 IEEE 31st International Conference on Computer Design (ICCD)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"34","resultStr":"{\"title\":\"Accelerator-rich CMPs: From concept to real hardware\",\"authors\":\"Yu-Ting Chen, J. Cong, M. Ghodrat, Muhuan Huang, Chunyue Liu, Bingjun Xiao, Yi Zou\",\"doi\":\"10.1109/ICCD.2013.6657039\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Application-specific accelerators provide 10-100× improvement in power efficiency over general-purpose processors. The accelerator-rich architectures are especially promising. This work discusses a prototype of accelerator-rich CMPs (PARC). During our development of PARC in real hardware, we encountered a set of technical challenges and proposed corresponding solutions. First, we provided system IPs that serve a sea of accelerators to transfer data between userspace and accelerator memories without cache overhead. Second, we designed a dedicated interconnect between accelerators and memories to enable memory sharing. Third, we implemented an accelerator manager to virtualize accelerator resources for users. Finally, we developed an automated flow with a number of IP templates and customizable interfaces to a C-based synthesis flow to enable rapid design and update of PARC. We implemented PARC in a Virtex-6 FPGA chip with integration of platform-specific peripherals and booting of unmodified Linux. Experimental results show that PARC can fully exploit the energy benefits of accelerators at little system overhead.\",\"PeriodicalId\":398811,\"journal\":{\"name\":\"2013 IEEE 31st International Conference on Computer Design (ICCD)\",\"volume\":\"15 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2013-11-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"34\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2013 IEEE 31st International Conference on Computer Design (ICCD)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICCD.2013.6657039\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2013 IEEE 31st International Conference on Computer Design (ICCD)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCD.2013.6657039","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 34

摘要

特定于应用程序的加速器比通用处理器的功率效率提高了10-100倍。富含加速器的架构尤其有前景。本文讨论了一种富加速器CMPs (PARC)的原型。在实际硬件中开发PARC的过程中，我们遇到了一系列的技术挑战，并提出了相应的解决方案。首先，我们提供了服务于大量加速器的系统ip，以便在没有缓存开销的情况下在用户空间和加速器内存之间传输数据。其次，我们在加速器和内存之间设计了专用互连，以实现内存共享。第三，我们实现了一个加速器管理器，为用户虚拟化加速器资源。最后，我们开发了一个自动化流程，其中包含许多IP模板和基于c的合成流程的可定制接口，以实现PARC的快速设计和更新。我们在一个Virtex-6 FPGA芯片上实现了PARC，该芯片集成了特定于平台的外设和未修改的Linux引导。实验结果表明，PARC可以在很小的系统开销下充分利用加速器的能量优势。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Accelerator-rich CMPs: From concept to real hardware

Application-specific accelerators provide 10-100× improvement in power efficiency over general-purpose processors. The accelerator-rich architectures are especially promising. This work discusses a prototype of accelerator-rich CMPs (PARC). During our development of PARC in real hardware, we encountered a set of technical challenges and proposed corresponding solutions. First, we provided system IPs that serve a sea of accelerators to transfer data between userspace and accelerator memories without cache overhead. Second, we designed a dedicated interconnect between accelerators and memories to enable memory sharing. Third, we implemented an accelerator manager to virtualize accelerator resources for users. Finally, we developed an automated flow with a number of IP templates and customizable interfaces to a C-based synthesis flow to enable rapid design and update of PARC. We implemented PARC in a Virtex-6 FPGA chip with integration of platform-specific peripherals and booting of unmodified Linux. Experimental results show that PARC can fully exploit the energy benefits of accelerators at little system overhead.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2013 IEEE 31st International Conference on Computer Design (ICCD)

自引率

0.00%

发文量