SiamS3C: spatial-channel cross-correlation for visual tracking with centerness-guided regression
Authors: Jianming Zhang, Wentao Chen, Yufan He, Li-Dan Kuang, Arun Kumar Sangaiah
Journal: Multimedia Systems (JCR Q2, Computer Science, Information Systems; impact factor 3.5)
Published: 2024-08-20
DOI: 10.1007/s00530-024-01450-5
Visual object tracking can be divided into object classification and bounding-box regression tasks, but sharing a single correlation map between the two leads to inaccuracy. Siamese trackers compute the correlation map with a cross-correlation operation that has a high computational cost, and because this operation is performed either on channels or in the spatial domain, it perceives global information only weakly. In addition, some Siamese trackers with a centerness branch ignore the association between the centerness branch and the bounding-box regression branch. To alleviate these problems, we propose a visual object tracker based on Spatial-Channel Cross-Correlation and Centerness-Guided Regression. First, we propose a spatial-channel cross-correlation module (SC3M) that combines the search-region feature and the template feature both on channels and in the spatial domain, which suppresses interference from distractors. As a lightweight module, SC3M computes two independent correlation maps that are fed to different subnetworks. Second, we propose a centerness-guided regression subnetwork consisting of a centerness branch and a bounding-box regression branch. The centerness guides the whole regression subnetwork, strengthening the association between the two branches and further suppressing low-quality predicted bounding boxes. Third, we conduct extensive experiments on five challenging benchmarks: GOT-10k, VOT2018, TrackingNet, OTB100 and UAV123. The results show excellent performance, and the tracker meets the real-time requirement at 48.52 fps.
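The abstract does not give the internals of SC3M or of the centerness-guided regression subnetwork, so the following is only a minimal sketch of the two conventional building blocks it refers to: channel-wise (depthwise) cross-correlation between template and search features, and a centerness score used to down-weight off-center predictions. The centerness formula is the FCOS-style definition, which is an assumption here rather than something stated in the abstract, and all function names are hypothetical.

```python
import math


def depthwise_xcorr(search, template):
    """Channel-wise cross-correlation (valid mode, stride 1).

    search:   list of C channels, each an Hs x Ws list of floats
    template: list of C channels, each an Ht x Wt list of floats
    Returns C response maps, each of size (Hs-Ht+1) x (Ws-Wt+1).
    This is the conventional operation, not the paper's SC3M.
    """
    out = []
    for s_ch, t_ch in zip(search, template):
        hs, ws = len(s_ch), len(s_ch[0])
        ht, wt = len(t_ch), len(t_ch[0])
        resp = [
            [
                sum(
                    s_ch[i + di][j + dj] * t_ch[di][dj]
                    for di in range(ht)
                    for dj in range(wt)
                )
                for j in range(ws - wt + 1)
            ]
            for i in range(hs - ht + 1)
        ]
        out.append(resp)
    return out


def centerness(l, t, r, b):
    """FCOS-style centerness (assumed definition).

    l, t, r, b are the distances from a location to the left, top,
    right and bottom sides of the predicted box.
    """
    if min(l, t, r, b) <= 0:
        return 0.0  # location lies on or outside the box boundary
    return math.sqrt((min(l, r) / max(l, r)) * (min(t, b) / max(t, b)))


def rank_score(cls_score, l, t, r, b):
    """Scale a classification score by centerness to penalize
    predictions made far from the box center."""
    return cls_score * centerness(l, t, r, b)
```

A location at the exact center of its box keeps its full classification score, while one close to a box edge is scaled toward zero, which is the usual way a centerness branch suppresses low-quality boxes.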
About the journal:
This journal details innovative research ideas, emerging technologies, state-of-the-art methods and tools in all aspects of multimedia computing, communication, storage, and applications. It features theoretical, experimental, and survey articles.