Local Fine-Grained Visual Tracking

IF 8.4 1区 计算机科学 Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS
Jingjing Wu;Yifan Sun;Richang Hong
{"title":"Local Fine-Grained Visual Tracking","authors":"Jingjing Wu;Yifan Sun;Richang Hong","doi":"10.1109/TMM.2025.3535329","DOIUrl":null,"url":null,"abstract":"This paper introduces a novel local fine-grained visual tracking task, aiming to precisely locate arbitrary local parts of objects. This task is motivated by our observation that in many realistic scenarios, the user demands to track a local part instead of a holistic object. However, the absence of an evaluation dataset and the distinctive characteristics of local fine-grained targets present extra challenges in conducting this research. To tackle these issues, first, this paper constructs a local fine-grained tracking (LFT) dataset to evaluate the tracking performance for local fine-grained targets. Second, this paper designs a cutting-edge solution to handle the challenges posed by properties of local objects, including ambiguity and high-proportion backgrounds. It consists of a hierarchical adaptive mask mechanism and foreground-background differentiated learning. The former adaptively searches for and masks ambiguity, which drives the network to concentrate on the local target instead of the holistic objects. The latter is constructed to distinguish foreground and background in an unsupervised manner, which is beneficial to mitigate the impacts of high-proportion backgrounds. Extensive analytic experiments are performed to verify the effectiveness of each submodule in the proposed fine-grained tracker.","PeriodicalId":13273,"journal":{"name":"IEEE Transactions on Multimedia","volume":"27 ","pages":"3426-3436"},"PeriodicalIF":8.4000,"publicationDate":"2025-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Multimedia","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10855545/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0

Abstract

This paper introduces a novel local fine-grained visual tracking task, aiming to precisely locate arbitrary local parts of objects. This task is motivated by our observation that in many realistic scenarios, the user demands to track a local part instead of a holistic object. However, the absence of an evaluation dataset and the distinctive characteristics of local fine-grained targets present extra challenges in conducting this research. To tackle these issues, first, this paper constructs a local fine-grained tracking (LFT) dataset to evaluate the tracking performance for local fine-grained targets. Second, this paper designs a cutting-edge solution to handle the challenges posed by properties of local objects, including ambiguity and high-proportion backgrounds. It consists of a hierarchical adaptive mask mechanism and foreground-background differentiated learning. The former adaptively searches for and masks ambiguity, which drives the network to concentrate on the local target instead of the holistic objects. The latter is constructed to distinguish foreground and background in an unsupervised manner, which is beneficial to mitigate the impacts of high-proportion backgrounds. Extensive analytic experiments are performed to verify the effectiveness of each submodule in the proposed fine-grained tracker.
局部细粒度视觉跟踪
本文提出了一种新的局部细粒度视觉跟踪任务,旨在精确定位目标的任意局部部分。这项任务的动机是我们观察到,在许多现实场景中,用户要求跟踪局部部分而不是整体对象。然而,缺乏评估数据集和局部细粒度目标的独特特征为开展这项研究带来了额外的挑战。为了解决这些问题,本文首先构建了一个局部细粒度跟踪(LFT)数据集来评估局部细粒度目标的跟踪性能。其次,本文设计了一种前沿的解决方案来处理局部目标的属性所带来的挑战,包括模糊性和高比例背景。它由分层自适应掩模机制和前景-背景差异化学习组成。前者自适应地搜索和掩盖模糊性,使网络集中于局部目标而不是整体对象。后者的构建是为了以无监督的方式区分前景和背景,这有利于减轻高比例背景的影响。为了验证所提出的细粒度跟踪器中每个子模块的有效性,进行了大量的分析实验。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
IEEE Transactions on Multimedia
IEEE Transactions on Multimedia 工程技术-电信学
CiteScore
11.70
自引率
11.00%
发文量
576
审稿时长
5.5 months
期刊介绍: The IEEE Transactions on Multimedia delves into diverse aspects of multimedia technology and applications, covering circuits, networking, signal processing, systems, software, and systems integration. The scope aligns with the Fields of Interest of the sponsors, ensuring a comprehensive exploration of research in multimedia.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信