映射实时运动估计类型算法到内存高效,可编程的多处理器架构

E. De Greef, F. Catthoor, H. De Man
{"title":"映射实时运动估计类型算法到内存高效,可编程的多处理器架构","authors":"E. De Greef,&nbsp;F. Catthoor,&nbsp;H. De Man","doi":"10.1016/0165-6074(95)99030-9","DOIUrl":null,"url":null,"abstract":"<div><p>In this paper, an architectural template is presented, which is able to execute the full search motion estimation algorithm or other similar video or image processing algorithms in real time. The architecture is based on a set of programmable video signal processors (VSP's). It is also possible to integrate the processor cores and their local memories on a (set of) chip(s). Due to the programmability, the system is very flexible and can be used for emulation of other similar block-oriented local-neighborhood algorithms. The architecture can be easily divided into several partitions, without data-exchange between partitions. Special attention is paid to memory size and transfer optimization, which are dominant factors for both area and power cost. The trade-offs and techniques used to arrive at these solutions are explained in detail. It is shown that careful optimizations can lead to large savings in memory size (up to 66%) and bandwidth requirements (up to a factor of 4) compared to a straightforward solution.</p></div>","PeriodicalId":100927,"journal":{"name":"Microprocessing and Microprogramming","volume":"41 5","pages":"Pages 409-423"},"PeriodicalIF":0.0000,"publicationDate":"1995-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1016/0165-6074(95)99030-9","citationCount":"17","resultStr":"{\"title\":\"Mapping real-time motion estimation type algorithms to memory efficient, programmable multi-processor architectures\",\"authors\":\"E. De Greef,&nbsp;F. Catthoor,&nbsp;H. De Man\",\"doi\":\"10.1016/0165-6074(95)99030-9\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>In this paper, an architectural template is presented, which is able to execute the full search motion estimation algorithm or other similar video or image processing algorithms in real time. The architecture is based on a set of programmable video signal processors (VSP's). It is also possible to integrate the processor cores and their local memories on a (set of) chip(s). Due to the programmability, the system is very flexible and can be used for emulation of other similar block-oriented local-neighborhood algorithms. The architecture can be easily divided into several partitions, without data-exchange between partitions. Special attention is paid to memory size and transfer optimization, which are dominant factors for both area and power cost. The trade-offs and techniques used to arrive at these solutions are explained in detail. It is shown that careful optimizations can lead to large savings in memory size (up to 66%) and bandwidth requirements (up to a factor of 4) compared to a straightforward solution.</p></div>\",\"PeriodicalId\":100927,\"journal\":{\"name\":\"Microprocessing and Microprogramming\",\"volume\":\"41 5\",\"pages\":\"Pages 409-423\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1995-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://sci-hub-pdf.com/10.1016/0165-6074(95)99030-9\",\"citationCount\":\"17\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Microprocessing and Microprogramming\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/0165607495990309\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Microprocessing and Microprogramming","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/0165607495990309","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 17

摘要

本文提出了一个架构模板,该模板能够实时执行全搜索运动估计算法或其他类似的视频或图像处理算法。该体系结构基于一组可编程视频信号处理器(VSP)。也可以将处理器核心和它们的本地存储器集成在一组芯片上。由于具有可编程性,该系统具有很强的灵活性,可用于模拟其他类似的面向块的局部邻域算法。该体系结构可以很容易地划分为几个分区,分区之间不需要数据交换。特别注意存储器大小和传输优化,这是面积和功耗成本的主要因素。详细解释了用于实现这些解决方案的权衡和技术。结果表明,与直接的解决方案相比,仔细的优化可以大大节省内存大小(最多66%)和带宽需求(最多4倍)。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Mapping real-time motion estimation type algorithms to memory efficient, programmable multi-processor architectures

In this paper, an architectural template is presented, which is able to execute the full search motion estimation algorithm or other similar video or image processing algorithms in real time. The architecture is based on a set of programmable video signal processors (VSP's). It is also possible to integrate the processor cores and their local memories on a (set of) chip(s). Due to the programmability, the system is very flexible and can be used for emulation of other similar block-oriented local-neighborhood algorithms. The architecture can be easily divided into several partitions, without data-exchange between partitions. Special attention is paid to memory size and transfer optimization, which are dominant factors for both area and power cost. The trade-offs and techniques used to arrive at these solutions are explained in detail. It is shown that careful optimizations can lead to large savings in memory size (up to 66%) and bandwidth requirements (up to a factor of 4) compared to a straightforward solution.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信