Authors: M. Reichenbach, B. Pfundt, D. Fey
Venue: 2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS)
Published: 2015-07-19
DOI: 10.1109/SAMOS.2015.7363664 (https://doi.org/10.1109/SAMOS.2015.7363664)
Framework for parameter analysis of FPGA-based image processing architectures
Image processing algorithms that work only on a local neighbourhood are used in nearly every image processing application. Very often, several iterations are performed on a fixed neighbourhood, which leads to the description of stencil codes. A promising approach in embedded systems is to use the massively parallel computation power of an FPGA for this kind of algorithm. If the FPGA is placed directly inside the image acquisition unit, forming a smart camera, this not only speeds up processing but also reduces or even eliminates the PC-based hardware, which saves space and power. However, most designers begin from scratch when they have to implement stencil computations in smart cameras. This leads to an FPGA that is not fully utilized, because efficient usage of the given resources is treated as secondary to functional correctness. In this paper, we therefore present a framework for stencil code applications which immediately delivers the best architecture with regard to prominent resource criteria. An analytical model is used to find an optimized parameter set (degree of parallelism, usage of buffers, etc.) for a highly flexible FPGA implementation. A graphical tool allows the effects of certain parameters to be evaluated further. Our results show that we are able to create an optimized hardware architecture for this application domain.
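To make the term "stencil code" concrete, the following is a minimal software sketch of an iterated 3x3 neighbourhood operation. It is not the framework or the hardware architecture described in the paper; the function names `apply_stencil` and `iterate_stencil`, the fixed 3x3 window, and the choice to copy border pixels unchanged are all assumptions made for illustration.

```python
def apply_stencil(image, weights):
    """Apply a 3x3 weighted stencil once (illustrative helper, not from the paper).

    `image` is a list of rows; `weights` is a 3x3 list of coefficients.
    Border pixels are copied unchanged, which is an assumed boundary policy.
    """
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]  # copy, so the input is never modified
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            acc = 0
            # Every output pixel depends only on its fixed local neighbourhood.
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    acc += weights[dy + 1][dx + 1] * image[y + dy][x + dx]
            out[y][x] = acc
    return out


def iterate_stencil(image, weights, iterations):
    """Repeat the same stencil several times over the whole image."""
    for _ in range(iterations):
        image = apply_stencil(image, weights)
    return image


if __name__ == "__main__":
    img = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
    box = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
    print(iterate_stencil(img, box, 2))
```

Because each output pixel reads only a small fixed window, the inner loops are independent of one another, which is the property that makes this class of algorithm a natural candidate for parallel evaluation on an FPGA.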