Zhenyu Huang, Gun Li, Xudong Sun, Yong Chen, Jie Sun, Zhangsong Ni, Yang Yang
{"title":"面向无人机实时跟踪的暹罗密集像素级融合网络","authors":"Zhenyu Huang, Gun Li, Xudong Sun, Yong Chen, Jie Sun, Zhangsong Ni, Yang Yang","doi":"10.32604/cmc.2023.039489","DOIUrl":null,"url":null,"abstract":"Onboard visual object tracking in unmanned aerial vehicles (UAVs) has attracted much interest due to its versatility. Meanwhile, due to high precision, Siamese networks are becoming hot spots in visual object tracking. However, most Siamese trackers fail to balance the tracking accuracy and time within onboard limited computational resources of UAVs. To meet the tracking precision and real-time requirements, this paper proposes a Siamese dense pixel-level network for UAV object tracking named SiamDPL. Specifically, the Siamese network extracts features of the search region and the template region through a parameter-shared backbone network, then performs correlation matching to obtain the candidate region with high similarity. To improve the matching effect of template and search features, this paper designs a dense pixel-level feature fusion module to enhance the matching ability by pixel-wise correlation and enrich the feature diversity by dense connection. An attention module composed of self-attention and channel attention is introduced to learn global context information and selectively emphasize the target feature region in the spatial and channel dimensions. In addition, a target localization module is designed to improve target location accuracy. Compared with other advanced trackers, experiments on two public benchmarks, which are UAV123@10fps and UAV20L from the unmanned air vehicle123 (UAV123) dataset, show that SiamDPL can achieve superior performance and low complexity with a running speed of 100.1 fps on NVIDIA TITAN RTX.","PeriodicalId":93535,"journal":{"name":"Computers, materials & continua","volume":"110 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Siamese Dense Pixel-Level Fusion Network for Real-Time UAV Tracking\",\"authors\":\"Zhenyu Huang, Gun Li, Xudong Sun, Yong Chen, Jie Sun, Zhangsong Ni, Yang Yang\",\"doi\":\"10.32604/cmc.2023.039489\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Onboard visual object tracking in unmanned aerial vehicles (UAVs) has attracted much interest due to its versatility. Meanwhile, due to high precision, Siamese networks are becoming hot spots in visual object tracking. However, most Siamese trackers fail to balance the tracking accuracy and time within onboard limited computational resources of UAVs. To meet the tracking precision and real-time requirements, this paper proposes a Siamese dense pixel-level network for UAV object tracking named SiamDPL. Specifically, the Siamese network extracts features of the search region and the template region through a parameter-shared backbone network, then performs correlation matching to obtain the candidate region with high similarity. To improve the matching effect of template and search features, this paper designs a dense pixel-level feature fusion module to enhance the matching ability by pixel-wise correlation and enrich the feature diversity by dense connection. An attention module composed of self-attention and channel attention is introduced to learn global context information and selectively emphasize the target feature region in the spatial and channel dimensions. In addition, a target localization module is designed to improve target location accuracy. Compared with other advanced trackers, experiments on two public benchmarks, which are UAV123@10fps and UAV20L from the unmanned air vehicle123 (UAV123) dataset, show that SiamDPL can achieve superior performance and low complexity with a running speed of 100.1 fps on NVIDIA TITAN RTX.\",\"PeriodicalId\":93535,\"journal\":{\"name\":\"Computers, materials & continua\",\"volume\":\"110 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Computers, materials & continua\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.32604/cmc.2023.039489\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computers, materials & continua","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.32604/cmc.2023.039489","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Siamese Dense Pixel-Level Fusion Network for Real-Time UAV Tracking
Onboard visual object tracking in unmanned aerial vehicles (UAVs) has attracted much interest due to its versatility. Meanwhile, due to high precision, Siamese networks are becoming hot spots in visual object tracking. However, most Siamese trackers fail to balance the tracking accuracy and time within onboard limited computational resources of UAVs. To meet the tracking precision and real-time requirements, this paper proposes a Siamese dense pixel-level network for UAV object tracking named SiamDPL. Specifically, the Siamese network extracts features of the search region and the template region through a parameter-shared backbone network, then performs correlation matching to obtain the candidate region with high similarity. To improve the matching effect of template and search features, this paper designs a dense pixel-level feature fusion module to enhance the matching ability by pixel-wise correlation and enrich the feature diversity by dense connection. An attention module composed of self-attention and channel attention is introduced to learn global context information and selectively emphasize the target feature region in the spatial and channel dimensions. In addition, a target localization module is designed to improve target location accuracy. Compared with other advanced trackers, experiments on two public benchmarks, which are UAV123@10fps and UAV20L from the unmanned air vehicle123 (UAV123) dataset, show that SiamDPL can achieve superior performance and low complexity with a running speed of 100.1 fps on NVIDIA TITAN RTX.