Fusing DCN and BBAV for Remote Sensing Image Object Detection

International Journal of Cognitive Informatics and Natural Intelligence, Pub Date: 2024-01-07, DOI: 10.4018/ijcini.335496

Honghuan Chen, Keming Wang


Citations: 0

Abstract

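In oriented object detection for aerial remote sensing images, the receptive field boundaries of ordinary convolution kernels are often not parallel to the boundaries of the objects to be detected, which degrades model precision. To address this, an object detection model (DCN-BBAV) that fuses deformable convolution networks (DCNs) with box boundary-aware vectors (BBAVs) is proposed. First, BBAV is taken as the baseline, and the ordinary convolution kernels in its backbone network are replaced with deformable convolution kernels. Then, a spatial attention module (SAM) and a channel attention mechanism (CAM) are used to strengthen the feature extraction ability of the DCN. Finally, a term based on the dot products of the four adjacent boundary vectors, which reflects their included angles, is added to the loss function for the rotated box parameters, improving the regression precision of the boundary vectors. The DCN-BBAV model reaches 77.30% mean average precision (mAP) on the DOTA dataset. On HRSC2016, it outperforms other advanced rotated-box object detection methods, achieving 90.52% mAP under the VOC07 metric and 96.67% mAP under the VOC12 metric.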
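The abstract does not give the exact form of the added loss term, so the following is only a minimal sketch of one plausible reading: for a rectangular box, the four boundary vectors taken in order around the box should be pairwise perpendicular, so the dot product of each adjacent pair should be zero. The function name `adjacent_orthogonality_penalty`, the normalization by vector length, and the use of the mean absolute cosine are all assumptions made for illustration, not the authors' formulation.

```python
import math


def adjacent_orthogonality_penalty(vectors):
    """Mean absolute cosine between cyclically adjacent 2-D vectors.

    `vectors` holds four (x, y) pairs ordered around the box, for example
    top, right, bottom, left (this ordering is an assumption). The result
    is 0 when every adjacent pair is perpendicular and approaches 1 as
    adjacent pairs become parallel.
    """
    n = len(vectors)
    total = 0.0
    for i in range(n):
        ax, ay = vectors[i]
        bx, by = vectors[(i + 1) % n]  # wrap around: last pairs with first
        dot = ax * bx + ay * by
        norm = math.hypot(ax, ay) * math.hypot(bx, by)
        total += abs(dot) / max(norm, 1e-12)  # guard against zero-length vectors
    return total / n


# A rectangle rotated by 30 degrees with half-width 4 and half-height 2.
theta = math.radians(30.0)
u = (math.cos(theta), math.sin(theta))   # unit vector along the box width
v = (-math.sin(theta), math.cos(theta))  # unit vector along the box height
rect_vectors = [
    (2 * v[0], 2 * v[1]),    # top
    (4 * u[0], 4 * u[1]),    # right
    (-2 * v[0], -2 * v[1]),  # bottom
    (-4 * u[0], -4 * u[1]),  # left
]
rect_penalty = adjacent_orthogonality_penalty(rect_vectors)

# A skewed prediction whose adjacent vectors meet at 45 or 135 degrees.
skew_vectors = [(0.0, 1.0), (1.0, 1.0), (0.0, -1.0), (-1.0, -1.0)]
skew_penalty = adjacent_orthogonality_penalty(skew_vectors)
```

In a training setup, a term of this kind would be computed on the predicted vectors and added, with some weight, to the existing box-parameter regression loss; the weight and the way it is combined are not stated in the abstract.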



