An FPGA Implementation of Multi-stream Tracking Hardware using 2D SIMD Array (Abstract Only)
R. Takasu, Yoichi Tomioka, Takashi Aoki, H. Kitazawa
Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays
Published: 2015-02-22
DOI: https://doi.org/10.1145/2684746.2689119
Citations: 0
Abstract
Worldwide, many surveillance systems are in operation for crime deterrence. An effective system should have low power consumption, require only a small storage capacity, and need little human effort. Multi-stream tracking on a field-programmable gate array (FPGA) is important for such surveillance systems. In this paper, we propose multi-stream tracking hardware that can extract moving objects and their motion vectors in real time from the multiple video streams received from 64 cameras. The key technologies for multi-stream processing are as follows. (1) To avoid maintaining the background, we apply a frame difference method. Moreover, the flows of objects are calculated by block matching; these flows are effective for analyzing human motion. (2) To avoid a bus bottleneck and memory contention in the communication between processing elements (PEs), we apply synchronous shift data transfer (SSDT), which transfers data in the same direction for all PEs. We also propose an extended SSDT for communication between PEs when multiple blocks are processed in one PE. (3) We present a C++-based integrated development tool for control code. Control code written in C++ can easily be assembled and verified with the tool. We implemented the proposed hardware on a Stratix V 5SGXEA7K2F40C2N device. The operating frequency is 50 MHz, and processing a set of four QVGA frames takes 394k clock cycles on average. The proposed hardware achieved 520 fps and can process multi-stream video from 64 cameras. The same processing on a 3.4 GHz Core i7-3770 CPU ran at 8.4 fps. The proposed hardware was therefore about 62 times faster than that CPU.
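As a rough software illustration of item (1), the sketch below pairs a thresholded frame difference with exhaustive block matching based on the sum of absolute differences (SAD). It is not the authors' hardware design: the abstract does not state the matching cost, block size, search range, or threshold, so SAD and every name here (`Frame`, `MotionVector`, `frameDifference`, `blockSad`, `matchBlock`) are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical grayscale frame, stored row-major (not from the paper).
struct Frame {
    int width;
    int height;
    std::vector<std::uint8_t> pixels;  // size == width * height

    int at(int x, int y) const {
        return pixels[static_cast<std::size_t>(y) * width + x];
    }
};

// Displacement of a block from the previous frame to the current frame.
struct MotionVector {
    int dx;
    int dy;
};

// Frame difference: a pixel is marked as moving (1) when its intensity
// changed by more than `threshold` between two frames. No background
// image is kept; only the two frames are compared.
std::vector<std::uint8_t> frameDifference(const Frame& prev, const Frame& curr,
                                          int threshold) {
    assert(prev.pixels.size() == curr.pixels.size());
    std::vector<std::uint8_t> mask(curr.pixels.size(), 0);
    for (std::size_t i = 0; i < curr.pixels.size(); ++i) {
        int diff = std::abs(int(curr.pixels[i]) - int(prev.pixels[i]));
        mask[i] = diff > threshold ? 1 : 0;
    }
    return mask;
}

// SAD between the block at (bx, by) in `prev` and the block displaced by
// (dx, dy) in `curr`. SAD is an assumed cost, chosen for simplicity.
long blockSad(const Frame& prev, const Frame& curr, int bx, int by,
              int dx, int dy, int blockSize) {
    long sad = 0;
    for (int y = 0; y < blockSize; ++y) {
        for (int x = 0; x < blockSize; ++x) {
            sad += std::abs(prev.at(bx + x, by + y) -
                            curr.at(bx + dx + x, by + dy + y));
        }
    }
    return sad;
}

// Exhaustive search over displacements in [-range, range] on both axes.
// Zero motion is the starting candidate and is kept on ties.
MotionVector matchBlock(const Frame& prev, const Frame& curr, int bx, int by,
                        int blockSize, int range) {
    MotionVector best{0, 0};
    long bestSad = blockSad(prev, curr, bx, by, 0, 0, blockSize);
    for (int dy = -range; dy <= range; ++dy) {
        for (int dx = -range; dx <= range; ++dx) {
            int cx = bx + dx;
            int cy = by + dy;
            // Skip candidates that would fall outside the current frame.
            if (cx < 0 || cy < 0 || cx + blockSize > curr.width ||
                cy + blockSize > curr.height) {
                continue;
            }
            long sad = blockSad(prev, curr, bx, by, dx, dy, blockSize);
            if (sad < bestSad) {
                bestSad = sad;
                best = {dx, dy};
            }
        }
    }
    return best;
}
```

In a sketch like this, `frameDifference` would pick out the blocks that contain motion, and `matchBlock` would then be called only for those blocks to estimate their flow. The per-block search is independent of every other block, which is the kind of regular workload that maps naturally onto an array of PEs.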