Jing Huang, Zhenxue Chen, Luna Sun, Tian Liang, Lei Cai
Published in: 2022 International Conference on Artificial Intelligence of Things (ICAIoT), 2022-12-29. DOI: 10.1109/ICAIoT57170.2022.10121885 (https://doi.org/10.1109/ICAIoT57170.2022.10121885)
SKNetV2: Improved Selective Kernel Networks for Object Detection
The main task of object detection is to detect all objects in an image simultaneously. Detecting objects of different scales often places conflicting demands on neurons’ receptive fields; therefore, a single receptive field in each convolutional layer cannot effectively handle scale variation. In this paper, we propose SKNetV2, an improved version of Selective Kernel Networks (SKNet). Selective Kernel (SK) convolution adaptively fuses convolutional branches with different kernel sizes under the guidance of attention. However, SK convolution considers only channel attention and ignores spatial attention, which is equally important, so the receptive fields of the neurons remain spatially fixed for a given input. We propose an SKv2 convolution that applies spatial attention and channel attention simultaneously to fuse branches of different kernel sizes and then unifies the two attention results. The cooperation of these two attention mechanisms achieves fully selectable receptive fields. We derive SKNetV2 from SKNet by replacing the SK building block with the proposed SKv2 block. We demonstrate the effectiveness of SKNetV2 through extensive experiments on the challenging MS COCO dataset. Without bells and whistles, we achieve an AP of 45.5%, which surpasses the most recently proposed detectors.
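The branch fusion described in the abstract can be illustrated with a toy sketch in plain Python. It is not the authors' implementation. The abstract does not say how the attention logits are computed or how the two attention results are unified, so everything below is an assumption made for illustration: the function names (`channel_weights`, `spatial_weights`, `fuse`) are hypothetical, the logits come from simple average pooling with no learned parameters, and the two weight sets are unified by averaging. Each branch is a feature map stored as nested lists with shape channels x height x width.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def channel_weights(branches):
    """Softmax across branches for each channel; returns weights[b][c].

    Assumption: the logit is the global average of the channel (no learned layers).
    """
    num_b, num_c = len(branches), len(branches[0])
    logits = [[sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in br]
              for br in branches]
    per_channel = [softmax([logits[b][c] for b in range(num_b)])
                   for c in range(num_c)]
    return [[per_channel[c][b] for c in range(num_c)] for b in range(num_b)]


def spatial_weights(branches):
    """Softmax across branches for each location; returns weights[b][h][w].

    Assumption: the logit is the mean over channels at that location.
    """
    num_c = len(branches[0])
    height, width = len(branches[0][0]), len(branches[0][0][0])
    weights = [[[0.0] * width for _ in range(height)] for _ in branches]
    for h in range(height):
        for w in range(width):
            logits = [sum(br[c][h][w] for c in range(num_c)) / num_c
                      for br in branches]
            for b, p in enumerate(softmax(logits)):
                weights[b][h][w] = p
    return weights


def fuse(branches):
    """Fuse branches using both channel and spatial selection weights."""
    cw = channel_weights(branches)
    sw = spatial_weights(branches)
    num_c = len(branches[0])
    height, width = len(branches[0][0]), len(branches[0][0][0])
    out = [[[0.0] * width for _ in range(height)] for _ in range(num_c)]
    for b, br in enumerate(branches):
        for c in range(num_c):
            for h in range(height):
                for w in range(width):
                    # Assumed unification rule: average the two weights, which
                    # keeps the weights summing to 1 across branches.
                    weight = 0.5 * (cw[b][c] + sw[b][h][w])
                    out[c][h][w] += weight * br[c][h][w]
    return out
```

In this sketch the fused weight depends on both the channel index and the spatial location, so the mix of branches (standing in for kernel sizes) can differ from pixel to pixel. With channel weights alone, every location in a channel would share the same mix, which is the limitation of SK convolution that the abstract points out.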