Cristian Sestito, F. Spagnolo, P. Corsonello, S. Perri
{"title":"An Efficient Convolution Engine based on the À-trous Spatial Pyramid Pooling","authors":"Cristian Sestito, F. Spagnolo, P. Corsonello, S. Perri","doi":"10.1109/ASAP49362.2020.00022","DOIUrl":null,"url":null,"abstract":"This paper presents an efficient hardware architecture able to perform 2D dilated convolutions and suitable for the integration within modern heterogeneous embedded systems targeting semantic image segmentation. The proposed design supports multiple dilation rates. Moreover, it uses limited amounts of resources even when large convolution windows are processed. As a case study, the novel circuit has been integrated within a Xilinx Zynq-7000 FPSoC device to accelerate a state-of-the-art CNN model for medical images segmentation. Obtained results demonstrate that higher computational capabilities, reduced resources utilization and lower power consumption are achieved with respect to the competitors existing in literature.","PeriodicalId":375691,"journal":{"name":"2020 IEEE 31st International Conference on Application-specific Systems, Architectures and Processors (ASAP)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE 31st International Conference on Application-specific Systems, Architectures and Processors (ASAP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ASAP49362.2020.00022","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1
Abstract
This paper presents an efficient hardware architecture able to perform 2D dilated convolutions and suitable for the integration within modern heterogeneous embedded systems targeting semantic image segmentation. The proposed design supports multiple dilation rates. Moreover, it uses limited amounts of resources even when large convolution windows are processed. As a case study, the novel circuit has been integrated within a Xilinx Zynq-7000 FPSoC device to accelerate a state-of-the-art CNN model for medical images segmentation. Obtained results demonstrate that higher computational capabilities, reduced resources utilization and lower power consumption are achieved with respect to the competitors existing in literature.