Stage-by-Stage Adaptive Alignment Mechanism for Object Detection in Aerial Images

IF 2.6, Zone 3 (Engineering Technology), Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Electronics Pub Date : 2024-09-12 DOI:10.3390/electronics13183640

Jiangang Zhu, Donglin Jing, Dapeng Gao


Citations: 0

Abstract

Object detection in aerial images has found an increasingly broad range of applications in recent years. Unlike targets in images taken from a horizontal viewpoint, targets in aerial photos generally have arbitrary orientations, multiple scales, and high aspect ratios. Existing methods often employ a classification backbone network to extract translation-equivariant features (TEFs) and rely on many predefined anchors to handle objects with diverse appearance variations. However, they encounter misalignment at three levels (spatial, feature, and task) during different detection stages. In this study, we propose the Staged Adaptive Alignment Detector (SAADet) to address these challenges. For spatial alignment, a Spatial Selection Adaptive Network (SSANet) matches the convolutional receptive field to the scale of the object: a sequence of convolutions with increasing dilation rates captures spatial context at different ranges, and the model weights this information dynamically. For feature alignment, the preset horizontal anchor is first corrected to an oriented anchor, and an alignment convolution guided by the oriented anchor then aligns the backbone features with the object's orientation. For task alignment, features are decoupled using the Active Rotating Filter to mitigate the inconsistencies caused by sharing backbone features between the regression and classification tasks. Experimental results show that SAADet achieves a balance between speed and accuracy on two aerial image datasets, HRSC2016 and UCAS-AOD.
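
The spatial-alignment idea in the abstract (a convolution sequence with increasing dilation rates whose outputs are weighted dynamically) can be illustrated with a small sketch. This is not the authors' SSANet: the abstract gives no kernel sizes, dilation rates, or weighting formula, so the 3x3 kernels, the dilations 1, 2, 3, the softmax weighting, and the function names `receptive_field`, `softmax`, and `select_context` are all assumptions chosen for illustration. The weighting here is also a scalar simplification; a real network would typically learn the scores and may apply them per spatial position.

```python
import math


def receptive_field(kernel_sizes, dilations):
    """Receptive field (along one axis) of stacked stride-1 convolutions.

    Each layer with kernel size k and dilation d widens the field by (k - 1) * d.
    """
    rf = 1
    for k, d in zip(kernel_sizes, dilations):
        rf += (k - 1) * d
    return rf


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_context(branch_features, scores):
    """Blend one scalar feature per branch using softmax weights.

    Stands in for the 'dynamic weighting' of context captured at different
    ranges; the scores would be predicted by the model in practice.
    """
    weights = softmax(scores)
    return sum(w * f for w, f in zip(weights, branch_features))


# Hypothetical configuration: three 3x3 convolutions with dilation 1, 2, 3.
kernels = [3, 3, 3]
dilations = [1, 2, 3]

# Receptive field after each stage of the sequence.
fields = [
    receptive_field(kernels[: i + 1], dilations[: i + 1])
    for i in range(len(kernels))
]
print(fields)  # [3, 7, 13]

# Equal scores give equal weights, so the blend is the plain average.
print(select_context([1.0, 2.0, 3.0], [0.0, 0.0, 0.0]))
```

Under these assumed settings, each added stage sees a wider context (3, 7, then 13 pixels per axis), and the weighting step lets larger scores favor the branch whose range best matches the object's scale.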

Source journal

Electronics (Computer Science: Computer Networks and Communications)

CiteScore: 1.10
Self-citation rate: 10.30%
Articles published: 3515
Review time: 16.71 days

About the journal: Electronics (ISSN 2079-9292; CODEN: ELECGJ) is an international, open access journal on the science of electronics and its applications published quarterly online by MDPI.