Impact of Loop Tiling on the Controller Logic of Acceleration Engines

2009 20th IEEE International Conference on Application-specific Systems, Architectures and Processors. Pub Date: 2009-07-07. DOI: 10.1109/ASAP.2009.21

H. Dutta, J. Zhai, Frank Hannig, J. Teich


Citations: 5

Abstract

The high computational effort of modern signal and image processing applications often demands special-purpose accelerators in a system on chip (SoC). New high-level synthesis methodologies enable the automated design of such programmable or non-programmable accelerators. Loop tiling is a widely used transformation in these methodologies for dimensioning the accelerators so that the inherent massive parallelism of the considered algorithms matches the available functional units and processor elements. Innately, the applications are data-flow dominant and have almost no control flow, but applying tiling techniques has the disadvantage of a more complex control and communication flow. In this paper, we present a methodology for the automatic generation of the control engines of such accelerators. The controller orchestrates data transfer and computation. The effect of tiling on the area, latency, and power overhead of the controller is studied in detail. It is shown that the controller has a substantial overhead of up to 50% for different tiling and throughput parameters. The energy-delay product is also used as a metric for identifying optimal accelerator designs.
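To make the trade-off in the abstract concrete, the following is a minimal software sketch of loop tiling, not the authors' hardware methodology. It assumes a small FIR filter as the example algorithm (the abstract names no specific kernel), and the function names `fir_untiled`, `fir_tiled`, and `energy_delay_product` are hypothetical. The tiled version computes the same result but needs an extra tile counter and a boundary guard, which is the kind of added control the abstract refers to. The last function simply states the usual definition of the energy-delay product, energy multiplied by delay.

```python
def fir_untiled(x, h):
    """Reference FIR filter: a single flat output loop, no tile control."""
    n_out = len(x) - len(h) + 1
    return [sum(h[k] * x[i + k] for k in range(len(h))) for i in range(n_out)]


def fir_tiled(x, h, tile):
    """Same computation with the output loop split into tiles of `tile` iterations.

    The outer loop steps from tile to tile; the inner loop stands in for the
    iterations that would be mapped onto `tile` processing elements (an
    assumption made for illustration). Returns the outputs and the number of
    boundary tests executed, a crude proxy for controller work.
    """
    n_out = len(x) - len(h) + 1
    y = [0] * n_out
    control_checks = 0
    for t in range(0, n_out, tile):      # inter-tile counter
        for p in range(tile):            # intra-tile counter
            i = t + p
            control_checks += 1
            if i < n_out:                # guard needed for a partial last tile
                y[i] = sum(h[k] * x[i + k] for k in range(len(h)))
    return y, control_checks


def energy_delay_product(power_w, latency_s):
    """EDP = energy * delay, with energy taken as power * latency."""
    return (power_w * latency_s) * latency_s
```

With 8 outputs and a tile size of 3, the tiled loop runs three tiles and performs 9 boundary tests, one of which guards an iteration past the end of the data. In hardware, such counters and guards become controller logic, which is why the tile size affects controller area, latency, and power.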
