Yuanyun Wang, Wenshuang Zhang, Limin Zhang, Changwang Lai, Jun Wang
2021 International Conference on Information Technology and Biomedical Engineering (ICITBE), published 2021-12-01. DOI: 10.1109/ICITBE54178.2021.00022 (https://doi.org/10.1109/ICITBE54178.2021.00022)
Depthwise Over-parameterized Siamese Network for Visual Tracking
Siamese-network-based trackers have achieved excellent performance because they balance accuracy and speed. Extracting effective features from the target template and search regions is an important problem in visual tracking. Existing trackers generally use a Convolutional Neural Network (CNN) to extract features, which makes full use of deep features but ignores spatial structure information. That spatial structure information is very helpful for representing the appearance characteristics of targets and templates. In this paper, we propose a novel tracking algorithm based on a Siamese network. Specifically, it consists of a Siamese subnetwork for feature extraction and cross-correlation. The subnetwork improves feature learning by extracting both shallow spatial information and deep semantic information, and it accelerates model training. Extensive experiments conducted on three benchmarks, including OTB2015 and VOT2016, show that the proposed DOSiam tracker achieves superior performance compared with state-of-the-art trackers while running in real time at more than 60 FPS.
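The cross-correlation step mentioned in the abstract can be illustrated with a minimal sketch. This is not the DOSiam implementation: the feature extractor is omitted, the feature maps are made-up numbers, and the function names `cross_correlate` and `peak_location` are hypothetical. It only shows the general idea used by Siamese trackers, assuming stride 1 and no padding: the template features slide over the search-region features, and the position with the highest response is taken as the most likely target location.

```python
from typing import List, Tuple

# A feature map indexed as [channel][row][col]. In a real tracker these
# would come from the shared (Siamese) feature-extraction subnetwork.
Feature = List[List[List[float]]]


def cross_correlate(template: Feature, search: Feature) -> List[List[float]]:
    """Slide the template over the search features (stride 1, no padding)
    and sum the elementwise products over all channels at each offset."""
    channels = len(template)
    if channels != len(search):
        raise ValueError("template and search need the same channel count")
    th, tw = len(template[0]), len(template[0][0])
    sh, sw = len(search[0]), len(search[0][0])
    if th > sh or tw > sw:
        raise ValueError("template must not be larger than the search map")

    out_h, out_w = sh - th + 1, sw - tw + 1
    response = [[0.0] * out_w for _ in range(out_h)]
    for y in range(out_h):
        for x in range(out_w):
            total = 0.0
            for c in range(channels):
                for i in range(th):
                    for j in range(tw):
                        total += template[c][i][j] * search[c][y + i][x + j]
            response[y][x] = total
    return response


def peak_location(response: List[List[float]]) -> Tuple[int, int]:
    """Return (row, col) of the first maximum in the response map."""
    best_y, best_x = 0, 0
    for y, row in enumerate(response):
        for x, value in enumerate(row):
            if value > response[best_y][best_x]:
                best_y, best_x = y, x
    return best_y, best_x


# Toy single-channel example: the 2x2 template pattern is embedded in the
# 4x4 search map with its top-left corner at row 1, column 2.
template = [[[1.0, 2.0],
             [3.0, 4.0]]]
search = [[[0.0, 0.0, 0.0, 0.0],
           [0.0, 0.0, 1.0, 2.0],
           [0.0, 0.0, 3.0, 4.0],
           [0.0, 0.0, 0.0, 0.0]]]

response = cross_correlate(template, search)
print(peak_location(response))  # (1, 2)
```

In practice this operation is usually run as a batched convolution on a GPU rather than with Python loops; the loops here are only meant to make the arithmetic explicit.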