Mapping real-time motion estimation type algorithms to memory efficient, programmable multi-processor architectures

Microprocessing and Microprogramming · Pub Date: 1995-10-01 · DOI: 10.1016/0165-6074(95)99030-9

E. De Greef, F. Catthoor, H. De Man

Volume 41, Issue 5, Pages 409-423 · https://www.sciencedirect.com/science/article/pii/0165607495990309

Citations: 17

Abstract

This paper presents an architectural template that can execute the full-search motion estimation algorithm, or similar video and image processing algorithms, in real time. The architecture is based on a set of programmable video signal processors (VSPs); the processor cores and their local memories can also be integrated on one or more chips. Because it is programmable, the system is very flexible and can be used to emulate other block-oriented, local-neighborhood algorithms. The architecture can easily be divided into several partitions with no data exchange between them. Special attention is paid to optimizing memory size and data transfers, which are dominant factors for both area and power cost. The trade-offs and techniques used to arrive at these solutions are explained in detail. Careful optimization is shown to yield large savings in memory size (up to 66%) and bandwidth requirements (up to a factor of 4) compared to a straightforward solution.
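For readers unfamiliar with the algorithm class named in the abstract, the following is a minimal sketch of full-search block matching. It is not taken from the paper: the sum-of-absolute-differences (SAD) matching criterion, the function names `sad` and `full_search`, and the parameters `n` (block size) and `m` (search range) are all assumptions chosen for illustration. The sketch shows why memory traffic matters for this kind of algorithm: every candidate displacement re-reads a full block of the reference frame.

```python
def sad(cur, ref, bx, by, rx, ry, n):
    """Sum of absolute differences between the n x n block of `cur`
    at (bx, by) and the n x n block of `ref` at (rx, ry).
    SAD is an assumed matching criterion, not one stated in the abstract."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[ry + y][rx + x])
    return total


def full_search(cur, ref, bx, by, n, m):
    """Exhaustively test every displacement (dx, dy) within +/- m pixels
    and return (dx, dy, cost) for the lowest-cost match.
    Frames are lists of rows of integer pixel values."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-m, m + 1):
        for dx in range(-m, m + 1):
            rx, ry = bx + dx, by + dy
            # Skip candidates that fall partly outside the reference frame.
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, bx, by, rx, ry, n)
            # Strict comparison keeps the first candidate on ties.
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best
```

Each block evaluates up to (2m + 1)^2 candidates of n^2 pixels each, and the search windows of neighboring blocks overlap heavily, which is the kind of data reuse that memory size and transfer optimizations can exploit.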
