SiamMAN: Siamese Multi-Phase Aware Network for Real-Time Unmanned Aerial Vehicle Tracking

IF 4.4 | Zone 2, Earth Sciences | Q1 REMOTE SENSING
Drones Pub Date : 2023-12-13 DOI:10.3390/drones7120707
Faxue Liu, Xuan Wang, Qiqi Chen, Jinghong Liu, Chenglong Liu

Abstract

In this paper, we address aerial tracking by designing multi-phase aware networks that capture rich long-range dependencies. Existing aerial trackers are prone to drift in scenarios that place high demands on multi-layer long-range feature dependencies, such as viewpoint changes caused by the UAV shooting perspective and low resolution. In contrast to previous works that rely only on multi-scale feature fusion to obtain contextual information, we design a new architecture that adapts to the characteristics of different feature levels in challenging scenarios and adaptively integrates regional features with the corresponding global dependency information. Specifically, for the proposed tracker (SiamMAN), we first propose a two-stage aware neck (TAN): a cascaded splitting encoder (CSE) splits the feature channels to obtain distributed long-range relevance among the sub-branches, and a multi-level contextual decoder (MCD) then performs further global dependency fusion. Finally, we design a response map context encoder (RCE) that utilizes long-range contextual information in backpropagation to accomplish pixel-level updating of the deeper features and to better balance semantic and spatial information. Experiments on several well-known tracking benchmarks show that the proposed method outperforms state-of-the-art (SOTA) trackers, owing to the effective use of the proposed multi-phase aware network across different feature levels.
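The abstract describes the CSE only at a high level, so the following is a minimal sketch of the general idea of splitting feature channels into cascaded sub-branches, not the authors' implementation. It assumes plain scaled dot-product self-attention without learned weights, an additive hand-off from one sub-branch to the next, and a simple concatenation at the end. The names `self_attention`, `cascaded_split_attention`, and `num_splits` are hypothetical and do not come from the paper.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(tokens):
    """Scaled dot-product self-attention with no learned projections.

    tokens: array of shape (N, C), one row per spatial position.
    Returns an array of shape (N, C).
    """
    scale = np.sqrt(tokens.shape[1])
    weights = softmax(tokens @ tokens.T / scale, axis=-1)  # (N, N)
    return weights @ tokens


def cascaded_split_attention(feature, num_splits=4):
    """Illustrative channel-splitting cascade (an assumption, not the paper's CSE).

    feature: array of shape (C, H, W).
    Returns an array of the same shape.
    """
    c, h, w = feature.shape
    if c % num_splits != 0:
        raise ValueError("channel count must be divisible by num_splits")

    # Flatten the spatial grid so each position becomes one token.
    tokens = feature.reshape(c, h * w).T              # (N, C)
    groups = np.split(tokens, num_splits, axis=1)     # num_splits x (N, C/num_splits)

    outputs = []
    carry = np.zeros_like(groups[0])
    for group in groups:
        # Each sub-branch attends over all positions and also receives
        # the previous sub-branch's output (the assumed cascade step).
        out = self_attention(group + carry)
        outputs.append(out)
        carry = out

    fused = np.concatenate(outputs, axis=1)           # (N, C)
    return fused.T.reshape(c, h, w)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((8, 5, 5))
    y = cascaded_split_attention(x, num_splits=4)
    print(y.shape)
```

Because every sub-branch attends over all spatial positions, each channel group carries long-range context, and the cascade lets later groups see what earlier groups aggregated. A real tracker would add learned projections, normalization, and a decoder stage, none of which are specified in the abstract.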
Source journal: Drones (Engineering, Aerospace Engineering)
CiteScore: 5.60
Self-citation rate: 18.80%
Articles published: 331