ADSTrack:用于视觉跟踪的自适应动态采样

IF 5 2区 计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Zhenhai Wang, Lutao Yuan, Ying Ren, Sen Zhang, Hongyu Tian
{"title":"ADSTrack:用于视觉跟踪的自适应动态采样","authors":"Zhenhai Wang, Lutao Yuan, Ying Ren, Sen Zhang, Hongyu Tian","doi":"10.1007/s40747-024-01672-0","DOIUrl":null,"url":null,"abstract":"<p>The most common method for visual object tracking involves feeding an image pair comprising a template image and search region into a tracker. The tracker uses a backbone to process the information in the image pair. In pure Transformer-based frameworks, redundant information in image pairs exists throughout the tracking process and the corresponding negative tokens consume the same computational resources as the positive tokens while degrading the performance of the tracker. Therefore, we propose to solve this problem using an adaptive dynamic sampling strategy in a pure Transformer-based tracker, known as ADSTrack. ADSTrack progressively reduces irrelevant, redundant negative tokens in the search region that are not related to the tracked objectand the effect of noise generated by these tokens. The adaptive dynamic sampling strategy enhances the performance of the tracker by scoring and adaptive sampling of important tokens, and the number of tokens sampled varies according to the input image. Moreover, the adaptive dynamic sampling strategy is a parameterless token sampling strategy that does not use additional parameters. We add several extra tokens as auxiliary tokens to the backbone to further optimize the feature map. We extensively evaluate ADSTrack, achieving satisfactory results for seven test sets, including UAV123 and LaSOT.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"13 1","pages":""},"PeriodicalIF":5.0000,"publicationDate":"2024-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"ADSTrack: adaptive dynamic sampling for visual tracking\",\"authors\":\"Zhenhai Wang, Lutao Yuan, Ying Ren, Sen Zhang, Hongyu Tian\",\"doi\":\"10.1007/s40747-024-01672-0\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>The most common method for visual object tracking involves feeding an image pair comprising a template image and search region into a tracker. The tracker uses a backbone to process the information in the image pair. In pure Transformer-based frameworks, redundant information in image pairs exists throughout the tracking process and the corresponding negative tokens consume the same computational resources as the positive tokens while degrading the performance of the tracker. Therefore, we propose to solve this problem using an adaptive dynamic sampling strategy in a pure Transformer-based tracker, known as ADSTrack. ADSTrack progressively reduces irrelevant, redundant negative tokens in the search region that are not related to the tracked objectand the effect of noise generated by these tokens. The adaptive dynamic sampling strategy enhances the performance of the tracker by scoring and adaptive sampling of important tokens, and the number of tokens sampled varies according to the input image. Moreover, the adaptive dynamic sampling strategy is a parameterless token sampling strategy that does not use additional parameters. We add several extra tokens as auxiliary tokens to the backbone to further optimize the feature map. We extensively evaluate ADSTrack, achieving satisfactory results for seven test sets, including UAV123 and LaSOT.</p>\",\"PeriodicalId\":10524,\"journal\":{\"name\":\"Complex & Intelligent Systems\",\"volume\":\"13 1\",\"pages\":\"\"},\"PeriodicalIF\":5.0000,\"publicationDate\":\"2024-12-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Complex & Intelligent Systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1007/s40747-024-01672-0\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Complex & Intelligent Systems","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s40747-024-01672-0","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

摘要

最常见的视觉物体跟踪方法是将由模板图像和搜索区域组成的图像对输入跟踪器。跟踪器使用主干处理图像对中的信息。在纯粹基于变换器的框架中,图像对中的冗余信息存在于整个跟踪过程中,相应的负标记会消耗与正标记相同的计算资源,同时降低跟踪器的性能。因此,我们建议在基于纯变换器的跟踪器中采用自适应动态采样策略来解决这一问题,即 ADSTrack。ADSTrack 可以逐步减少搜索区域中与被跟踪物体无关的冗余负标记以及这些标记产生的噪声影响。自适应动态采样策略通过对重要标记进行评分和自适应采样来提高跟踪器的性能,采样标记的数量根据输入图像的不同而变化。此外,自适应动态采样策略是一种无参数标记采样策略,不使用额外的参数。我们在骨干网中添加了几个额外的标记作为辅助标记,以进一步优化特征图。我们对 ADSTrack 进行了广泛的评估,在包括 UAV123 和 LaSOT 在内的七个测试集中取得了令人满意的结果。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
ADSTrack: adaptive dynamic sampling for visual tracking

The most common method for visual object tracking involves feeding an image pair comprising a template image and search region into a tracker. The tracker uses a backbone to process the information in the image pair. In pure Transformer-based frameworks, redundant information in image pairs exists throughout the tracking process and the corresponding negative tokens consume the same computational resources as the positive tokens while degrading the performance of the tracker. Therefore, we propose to solve this problem using an adaptive dynamic sampling strategy in a pure Transformer-based tracker, known as ADSTrack. ADSTrack progressively reduces irrelevant, redundant negative tokens in the search region that are not related to the tracked objectand the effect of noise generated by these tokens. The adaptive dynamic sampling strategy enhances the performance of the tracker by scoring and adaptive sampling of important tokens, and the number of tokens sampled varies according to the input image. Moreover, the adaptive dynamic sampling strategy is a parameterless token sampling strategy that does not use additional parameters. We add several extra tokens as auxiliary tokens to the backbone to further optimize the feature map. We extensively evaluate ADSTrack, achieving satisfactory results for seven test sets, including UAV123 and LaSOT.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Complex & Intelligent Systems
Complex & Intelligent Systems COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE-
CiteScore
9.60
自引率
10.30%
发文量
297
期刊介绍: Complex & Intelligent Systems aims to provide a forum for presenting and discussing novel approaches, tools and techniques meant for attaining a cross-fertilization between the broad fields of complex systems, computational simulation, and intelligent analytics and visualization. The transdisciplinary research that the journal focuses on will expand the boundaries of our understanding by investigating the principles and processes that underlie many of the most profound problems facing society today.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信