具有多重关注机制的连体追踪网络

IF 2.6 4区 计算机科学 Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Yuzhuo Xu, Ting Li, Bing Zhu, Fasheng Wang, Fuming Sun
{"title":"具有多重关注机制的连体追踪网络","authors":"Yuzhuo Xu, Ting Li, Bing Zhu, Fasheng Wang, Fuming Sun","doi":"10.1007/s11063-024-11670-5","DOIUrl":null,"url":null,"abstract":"<p>Object trackers based on Siamese networks view tracking as a similarity-matching process. However, the correlation operation operates as a local linear matching process, limiting the tracker’s ability to capture the intricate nonlinear relationship between the template and search region branches. Moreover, most trackers don’t update the template and often use the first frame of an image as the initial template, which will easily lead to poor tracking performance of the algorithm when facing instances of deformation, scale variation, and occlusion of the tracking target. To this end, we propose a Simases tracking network with a multi-attention mechanism, including a template branch and a search branch. To adapt to changes in target appearance, we integrate dynamic templates and multi-attention mechanisms in the template branch to obtain more effective feature representation by fusing the features of initial templates and dynamic templates. To enhance the robustness of the tracking model, we utilize a multi-attention mechanism in the search branch that shares weights with the template branch to obtain multi-scale feature representation by fusing search region features at different scales. In addition, we design a lightweight and simple feature fusion mechanism, in which the Transformer encoder structure is utilized to fuse the information of the template area and search area, and the dynamic template is updated online based on confidence. Experimental results on publicly tracking datasets show that the proposed method achieves competitive results compared to several state-of-the-art trackers.</p>","PeriodicalId":51144,"journal":{"name":"Neural Processing Letters","volume":"41 1","pages":""},"PeriodicalIF":2.6000,"publicationDate":"2024-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Siamese Tracking Network with Multi-attention Mechanism\",\"authors\":\"Yuzhuo Xu, Ting Li, Bing Zhu, Fasheng Wang, Fuming Sun\",\"doi\":\"10.1007/s11063-024-11670-5\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Object trackers based on Siamese networks view tracking as a similarity-matching process. However, the correlation operation operates as a local linear matching process, limiting the tracker’s ability to capture the intricate nonlinear relationship between the template and search region branches. Moreover, most trackers don’t update the template and often use the first frame of an image as the initial template, which will easily lead to poor tracking performance of the algorithm when facing instances of deformation, scale variation, and occlusion of the tracking target. To this end, we propose a Simases tracking network with a multi-attention mechanism, including a template branch and a search branch. To adapt to changes in target appearance, we integrate dynamic templates and multi-attention mechanisms in the template branch to obtain more effective feature representation by fusing the features of initial templates and dynamic templates. To enhance the robustness of the tracking model, we utilize a multi-attention mechanism in the search branch that shares weights with the template branch to obtain multi-scale feature representation by fusing search region features at different scales. In addition, we design a lightweight and simple feature fusion mechanism, in which the Transformer encoder structure is utilized to fuse the information of the template area and search area, and the dynamic template is updated online based on confidence. Experimental results on publicly tracking datasets show that the proposed method achieves competitive results compared to several state-of-the-art trackers.</p>\",\"PeriodicalId\":51144,\"journal\":{\"name\":\"Neural Processing Letters\",\"volume\":\"41 1\",\"pages\":\"\"},\"PeriodicalIF\":2.6000,\"publicationDate\":\"2024-08-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Neural Processing Letters\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1007/s11063-024-11670-5\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neural Processing Letters","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s11063-024-11670-5","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

摘要

基于连体网络的物体跟踪器将跟踪视为一个相似性匹配过程。然而,相关操作是一个局部线性匹配过程,限制了跟踪器捕捉模板和搜索区域分支之间错综复杂的非线性关系的能力。此外,大多数跟踪器不会更新模板,通常使用图像的第一帧作为初始模板,这就很容易导致算法在面对跟踪目标的变形、尺度变化和遮挡等情况时跟踪性能不佳。为此,我们提出了一种具有多注意机制的 Simases 跟踪网络,包括模板分支和搜索分支。为了适应目标外观的变化,我们在模板分支中整合了动态模板和多注意机制,通过融合初始模板和动态模板的特征来获得更有效的特征表示。为了增强跟踪模型的鲁棒性,我们在搜索分支中利用与模板分支共享权重的多注意机制,通过融合不同尺度的搜索区域特征来获得多尺度特征表示。此外,我们还设计了一种轻量级的简单特征融合机制,利用变换器编码器结构融合模板区域和搜索区域的信息,并根据置信度在线更新动态模板。在公开跟踪数据集上的实验结果表明,与几种最先进的跟踪器相比,所提出的方法取得了具有竞争力的结果。
本文章由计算机程序翻译,如有差异,请以英文原文为准。

Siamese Tracking Network with Multi-attention Mechanism

Siamese Tracking Network with Multi-attention Mechanism

Object trackers based on Siamese networks view tracking as a similarity-matching process. However, the correlation operation operates as a local linear matching process, limiting the tracker’s ability to capture the intricate nonlinear relationship between the template and search region branches. Moreover, most trackers don’t update the template and often use the first frame of an image as the initial template, which will easily lead to poor tracking performance of the algorithm when facing instances of deformation, scale variation, and occlusion of the tracking target. To this end, we propose a Simases tracking network with a multi-attention mechanism, including a template branch and a search branch. To adapt to changes in target appearance, we integrate dynamic templates and multi-attention mechanisms in the template branch to obtain more effective feature representation by fusing the features of initial templates and dynamic templates. To enhance the robustness of the tracking model, we utilize a multi-attention mechanism in the search branch that shares weights with the template branch to obtain multi-scale feature representation by fusing search region features at different scales. In addition, we design a lightweight and simple feature fusion mechanism, in which the Transformer encoder structure is utilized to fuse the information of the template area and search area, and the dynamic template is updated online based on confidence. Experimental results on publicly tracking datasets show that the proposed method achieves competitive results compared to several state-of-the-art trackers.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Neural Processing Letters
Neural Processing Letters 工程技术-计算机:人工智能
CiteScore
4.90
自引率
12.90%
发文量
392
审稿时长
2.8 months
期刊介绍: Neural Processing Letters is an international journal publishing research results and innovative ideas on all aspects of artificial neural networks. Coverage includes theoretical developments, biological models, new formal modes, learning, applications, software and hardware developments, and prospective researches. The journal promotes fast exchange of information in the community of neural network researchers and users. The resurgence of interest in the field of artificial neural networks since the beginning of the 1980s is coupled to tremendous research activity in specialized or multidisciplinary groups. Research, however, is not possible without good communication between people and the exchange of information, especially in a field covering such different areas; fast communication is also a key aspect, and this is the reason for Neural Processing Letters
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信