UAT: Unsupervised object tracking based on graph attention information embedding
Authors: Lixin Wei, Rongzhe Zhu, Ziyu Hu, Zeyu Xi
Journal: Journal of Visual Communication and Image Representation, volume 104, Article 104283
Publication date: 2024-10-01
Publication type: Journal Article
DOI: 10.1016/j.jvcir.2024.104283
URL: https://www.sciencedirect.com/science/article/pii/S1047320324002396
Impact factor: 2.6 (JCR Q2, Computer Science, Information Systems)
Open access: no
Citations: 0
Abstract
An excellent unsupervised tracker combines a powerful base tracker with an effective unsupervised tracking strategy. However, most base trackers lack internal feature representations for the information embedding process, and most unsupervised trackers are not robust enough in complex environments and lack an effective template update strategy. To solve these problems, we propose an unsupervised object tracking method based on graph attention information embedding (UAT). UAT combines a graph attention mechanism with multi-scale features to construct a multi-scale graph attention (MGA) module. The MGA module dynamically and efficiently completes the information embedding between the template branch and the search area branch. The response map obtained by fusing the feature maps of the two branches is more informative about the location of the target. An attention-based information reinforcement update module (RUM) improves the robustness of the tracker. RUM enhances the representation of the feature map in both the spatial dimension and the channel dimension. Template features are also updated indirectly through information transfer between the two branches. RUM suppresses background interference and improves network perception during tracking. Experiments on challenging benchmarks such as VOT2018, VOT2019, TrackingNet, OTB100, LaSOT and UAV123 demonstrate that the proposed UAT achieves state-of-the-art performance among unsupervised trackers.
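The information embedding between the template branch and the search area branch can be pictured as attention over a complete bipartite graph, where each spatial location of a feature map is a node. The sketch below is only an illustration of that general idea, not the authors' MGA module: it assumes plain dot-product scores, omits the learned projections and the multi-scale features, and fuses by concatenation. The names `graph_attention_embed`, `search_nodes` and `template_nodes` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def graph_attention_embed(search_nodes, template_nodes):
    """Embed template information into every search node.

    Each node is the feature vector (list of floats) of one spatial
    location. Every search node is connected to every template node.
    Assumption: no learned weights, scores are raw dot products.
    """
    dim = len(template_nodes[0])
    embedded = []
    for s in search_nodes:
        # Attention weights of this search node over all template nodes.
        weights = softmax([dot(s, t) for t in template_nodes])
        # Weighted sum of template features (the message passed to s).
        message = [
            sum(w * t[k] for w, t in zip(weights, template_nodes))
            for k in range(dim)
        ]
        # Fuse by concatenating the node's own feature with the message.
        embedded.append(list(s) + message)
    return embedded


if __name__ == "__main__":
    search = [[1.0, 0.0], [0.0, 1.0]]
    template = [[10.0, 0.0], [0.0, 10.0]]
    for row in graph_attention_embed(search, template):
        print(row)
```

Each search node ends up carrying a template summary weighted toward the template locations it resembles most, which is one way a fused feature map can become more informative about the target location. A real tracker would apply this to convolutional feature maps with trained parameters.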
About the journal:
The Journal of Visual Communication and Image Representation publishes papers on state-of-the-art visual communication and image representation, with emphasis on novel technologies and theoretical work in this multidisciplinary area of pure and applied research. The field of visual communication and image representation is considered in its broadest sense and covers both digital and analog aspects as well as processing and communication in biological visual systems.