Development of New Efficient Transposed Convolution Techniques for Flame Segmentation from UAV-captured Images
Authors: F. A. Hossain, Youmin Zhang
Venue: 2021 3rd International Conference on Industrial Artificial Intelligence (IAI)
Publication date: 2021-11-08
DOI: 10.1109/IAI53119.2021.9619442 (https://doi.org/10.1109/IAI53119.2021.9619442)
Although Fully Convolutional Networks (FCNs) have proven to be a powerful tool for deep learning-based image segmentation, they remain too computationally expensive to run in real time on mobile platforms such as Unmanned Aerial Vehicles (UAVs). While significant effort has gone into making the encoder side of an FCN more efficient, the decoder side, which upsamples the feature maps, has received comparatively little attention. This paper proposes two new efficient upsampling techniques, “Reversed Depthwise Separable Transposed Convolution (RDSTC)” and “Compression-Expansion Transposed Convolution (CETC)”. Both are evaluated with a U-Net architecture on UAV-captured images of forest pile fires. RDSTC and CETC achieve Dice scores of 0.8815 and 0.8832, respectively, outperforming the commonly used bilinear interpolation and the original transposed convolution while significantly reducing the number of upsampling computations. These results demonstrate that the upsampling operation in a deep learning architecture can be made more efficient without degrading performance.
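The Dice scores quoted above measure the overlap between a predicted flame mask and its ground-truth mask. The abstract does not say how the authors computed it, so the following is only a minimal sketch of the standard Dice coefficient for binary masks. The function name `dice_score`, the smoothing term `eps`, and the toy masks are assumptions made for illustration and are not taken from the paper.

```python
from typing import Sequence


def dice_score(
    pred: Sequence[Sequence[int]],
    target: Sequence[Sequence[int]],
    eps: float = 1e-7,
) -> float:
    """Dice = 2|P ∩ T| / (|P| + |T|) for two binary masks of equal shape."""
    intersection = 0
    pred_sum = 0
    target_sum = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += p * t  # counts pixels that are 1 in both masks
            pred_sum += p
            target_sum += t
    # eps (an assumed smoothing term) keeps the ratio defined for two empty masks
    return (2.0 * intersection + eps) / (pred_sum + target_sum + eps)


# Hypothetical 3x4 masks: 1 = flame pixel, 0 = background
predicted = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
ground_truth = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

score = dice_score(predicted, ground_truth)
print(round(score, 4))
```

In this toy case the masks share 3 flame pixels out of 4 predicted and 3 true ones, so the score is 2·3 / (4 + 3) = 6/7. A score of 1.0 means perfect overlap, which gives a sense of scale for the reported values of about 0.88.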