A configurable logic architecture for dynamic hardware/software partitioning

Roman L. Lysecky, F. Vahid
{"title":"A configurable logic architecture for dynamic hardware/software partitioning","authors":"Roman L. Lysecky, F. Vahid","doi":"10.1109/DATE.2004.1268892","DOIUrl":null,"url":null,"abstract":"In previous work, we showed the benefits and feasibility of having a processor dynamically partition its executing software such that critical software kernels are transparently partitioned to execute as a hardware coprocessor on configurable logic - an approach we call warp processing. The configurable logic place and route step is the most computationally intensive part of such hardware/software partitioning, normally running for many minutes or hours on powerful desktop processors. In contrast, dynamic partitioning requires place and route to execute in just seconds and on a lean embedded processor. We have therefore designed a configurable logic architecture specifically for dynamic hardware/software partitioning. Through experiments with popular benchmarks, we show that by specifically focusing on the goal of software kernel speedup when designing the FPGA architecture, rather than on the more general goal of ASIC prototyping, we can perform place and route for our architecture 50 times faster, using 10,000 times less data memory, and 1,000 times less code memory, than popular commercial tools mapping to commercial configurable logic. Yet, we show that we obtain speedups (2x on average, and as much as 4x) and energy savings (33% on average, and up to 74%) when partitioning even just one loop, which are comparable to commercial tools and fabrics. Thus, our configurable logic architecture represents a good candidate for platforms that will support dynamic hardware/software partitioning, and enables ultra-fast desktop tools for hardware/software partitioning, and even for fast configurable logic design in general.","PeriodicalId":335658,"journal":{"name":"Proceedings Design, Automation and Test in Europe Conference and Exhibition","volume":"24 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2004-02-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"73","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings Design, Automation and Test in Europe Conference and Exhibition","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/DATE.2004.1268892","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 73

Abstract

In previous work, we showed the benefits and feasibility of having a processor dynamically partition its executing software such that critical software kernels are transparently partitioned to execute as a hardware coprocessor on configurable logic - an approach we call warp processing. The configurable logic place and route step is the most computationally intensive part of such hardware/software partitioning, normally running for many minutes or hours on powerful desktop processors. In contrast, dynamic partitioning requires place and route to execute in just seconds and on a lean embedded processor. We have therefore designed a configurable logic architecture specifically for dynamic hardware/software partitioning. Through experiments with popular benchmarks, we show that by specifically focusing on the goal of software kernel speedup when designing the FPGA architecture, rather than on the more general goal of ASIC prototyping, we can perform place and route for our architecture 50 times faster, using 10,000 times less data memory, and 1,000 times less code memory, than popular commercial tools mapping to commercial configurable logic. Yet, we show that we obtain speedups (2x on average, and as much as 4x) and energy savings (33% on average, and up to 74%) when partitioning even just one loop, which are comparable to commercial tools and fabrics. Thus, our configurable logic architecture represents a good candidate for platforms that will support dynamic hardware/software partitioning, and enables ultra-fast desktop tools for hardware/software partitioning, and even for fast configurable logic design in general.
用于动态硬件/软件分区的可配置逻辑体系结构
在之前的工作中,我们展示了让处理器动态分区其正在执行的软件的好处和可行性,这样关键的软件内核就可以透明地分区,作为硬件协处理器在可配置逻辑上执行——我们将这种方法称为warp处理。可配置逻辑位置和路由步骤是此类硬件/软件分区中计算最密集的部分,通常在功能强大的桌面处理器上运行数分钟或数小时。相比之下,动态分区需要在几秒钟内在一个精简的嵌入式处理器上执行位置和路由。因此,我们设计了一个专门用于动态硬件/软件分区的可配置逻辑体系结构。通过流行的基准测试实验,我们表明,通过在设计FPGA架构时特别关注软件内核加速的目标,而不是更一般的ASIC原型目标,我们可以比流行的商业工具映射到商业可配置逻辑快50倍,使用少10000倍的数据内存和少1000倍的代码内存。然而,我们表明,即使只划分一个循环,我们也可以获得速度提升(平均2倍,最多4倍)和能源节约(平均33%,最高74%),这与商业工具和结构相当。因此,我们的可配置逻辑架构代表了支持动态硬件/软件分区的平台的一个很好的候选平台,并且支持用于硬件/软件分区的超快速桌面工具,甚至支持一般的快速可配置逻辑设计。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信