{"title":"基于GCT关注和双模板更新的Siamese网络视觉跟踪算法","authors":"Sugang Ma, Siwei Sun, Lei Pu, Xiaobao Yang","doi":"10.1109/ICNLP58431.2023.00014","DOIUrl":null,"url":null,"abstract":"To address the problem of insufficient representational capability and lack of online update of the Fully-convolutional Siamese Network (SiamFC) tracker in complex scenes, this paper proposes a siamese network visual tracking algorithm based on GCT attention and dual-template update mechanism. First, the feature extraction network is constructed by replacing AlexNet with the VGG16 network and SoftPool is used to replace the maximum pooling layer. Secondly, the attention module is added after the backbone network to enhance the network’s ability to extract object features. Finally, a dual-template update mechanism is designed for response map fusion. Average Peak-to-Correlation Energy (APCE) is used to determine whether to update the dynamic templates, effectively improving the tracking robustness. The proposed algorithm is trained on the Got-10k dataset and tested on the OTB2015 and VOT2018 datasets. The experimental results show that, compared with SiamFC, the success rate and accuracy reach 0.663 and 0.891 on the OTB2015, which improve respectively 7.6% and 11.9%; On the VOT2018 dataset, the tracking accuracy, robustness and EAO are improved respectively by 2.9%, 29% and 14%. 
The proposed algorithm achieves high tracking accuracy in complex scenes and the tracking speed reaches 52.6 Fps, which meets the real-time tracking requirements.","PeriodicalId":53637,"journal":{"name":"Icon","volume":"51 1","pages":"31-36"},"PeriodicalIF":0.0000,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Siamese Network Visual Tracking Algorithm Based on GCT Attention and Dual-Template Update\",\"authors\":\"Sugang Ma, Siwei Sun, Lei Pu, Xiaobao Yang\",\"doi\":\"10.1109/ICNLP58431.2023.00014\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"To address the problem of insufficient representational capability and lack of online update of the Fully-convolutional Siamese Network (SiamFC) tracker in complex scenes, this paper proposes a siamese network visual tracking algorithm based on GCT attention and dual-template update mechanism. First, the feature extraction network is constructed by replacing AlexNet with the VGG16 network and SoftPool is used to replace the maximum pooling layer. Secondly, the attention module is added after the backbone network to enhance the network’s ability to extract object features. Finally, a dual-template update mechanism is designed for response map fusion. Average Peak-to-Correlation Energy (APCE) is used to determine whether to update the dynamic templates, effectively improving the tracking robustness. The proposed algorithm is trained on the Got-10k dataset and tested on the OTB2015 and VOT2018 datasets. The experimental results show that, compared with SiamFC, the success rate and accuracy reach 0.663 and 0.891 on the OTB2015, which improve respectively 7.6% and 11.9%; On the VOT2018 dataset, the tracking accuracy, robustness and EAO are improved respectively by 2.9%, 29% and 14%. 
The proposed algorithm achieves high tracking accuracy in complex scenes and the tracking speed reaches 52.6 Fps, which meets the real-time tracking requirements.\",\"PeriodicalId\":53637,\"journal\":{\"name\":\"Icon\",\"volume\":\"51 1\",\"pages\":\"31-36\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-03-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Icon\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICNLP58431.2023.00014\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"Arts and Humanities\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Icon","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICNLP58431.2023.00014","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"Arts and Humanities","Score":null,"Total":0}
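The abstract gates dynamic-template updates on the Average Peak-to-Correlation Energy of the response map. As a minimal sketch, the standard APCE definition (|F_max − F_min|² divided by the mean squared deviation of the map from its minimum) and a hypothetical gating rule could look like this; the threshold scheme below is illustrative, not taken from the paper:

```python
import numpy as np

def apce(response):
    """Average Peak-to-Correlation Energy of a tracker response map.

    APCE = |F_max - F_min|^2 / mean((F - F_min)^2).
    A sharp, unimodal peak yields a high APCE; a sudden drop suggests
    occlusion or drift, in which case updating the template is unsafe.
    """
    r = np.asarray(response, dtype=np.float64)
    f_max, f_min = r.max(), r.min()
    return (f_max - f_min) ** 2 / np.mean((r - f_min) ** 2)

def should_update_template(response, history_mean_apce, ratio=0.6):
    """Hypothetical gate: update the dynamic template only when the
    current APCE reaches a fraction of its running historical mean."""
    return apce(response) >= ratio * history_mean_apce
```

A sharp single-peak map scores much higher than a broad one, which is what makes APCE usable as an update criterion: for a 5x5 map that is zero everywhere except a unit peak, APCE evaluates to 25, well above a map whose energy is spread over several cells.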