Shicheng Zu, Kai Yang, Xiulai Wang, Zhongzheng Yu, Yawen Hu, Jia Long
Title: UAVs-based Small Object Detection and Tracking in Various Complex Scenarios
Published in: 2021 International Symposium on Electrical, Electronics and Information Engineering
Publication date: 2021-02-19
DOI: 10.1145/3459104.3459141 (https://doi.org/10.1145/3459104.3459141)
Citations: 3
Abstract
Object detection has progressed dramatically in recent years thanks to advances in neural networks. Most mainstream object detectors tend to detect objects of regular scale because their detection depends on deep convolutional feature maps. Our study focused on UAV-based small object detection at a high altitude, i.e., 100 meters. We constructed a pipeline that integrates a foreground segmentation algorithm, an image classification algorithm, a boosted cascaded classifier, and a tracker, so that the small object is detected and tracked progressively in a cascaded manner. We evaluated the pipeline's performance qualitatively and quantitatively under various complex conditions. The comparison study confirmed its superiority in small object detection and its strong robustness against various nuisance factors. Based on this pipeline, we developed a real-time UAV-based small object detection and tracking system. The system architecture and the general steps taken by the UAVs to realize small object detection were also presented. Finally, we qualitatively and quantitatively evaluated 8 popular trackers based on relevant image attributes, so that the most suitable tracker can be determined for a given condition. Our study demonstrated that implementation performance can be improved by taking advantage of the algorithm best suited to each task.

We also performed a quantitative evaluation of the 8 trackers on each pertinent image attribute. The results are shown in Table 2. For each attribute, we highlighted the most suitable tracker in bold. In terms of illumination variation (IV), the trackers that assemble multiple features, i.e., CSR-DCF and AdaBoost, or that use texture features, i.e., LBP and HoG, usually perform better because texture features are not sensitive to IV [21].
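The claim that texture features are insensitive to IV can be illustrated with a basic local binary pattern. The following is a minimal sketch under stated assumptions, not the implementation evaluated in the study: `lbp_3x3` is a hypothetical helper, and illumination change is modeled as a simple positive gain plus offset applied to every pixel.

```python
import numpy as np

def lbp_3x3(image):
    """Basic 8-neighbour LBP code for every interior pixel of a 2-D array."""
    img = np.asarray(image, dtype=np.float64)
    h, w = img.shape
    center = img[1:h - 1, 1:w - 1]
    # Neighbour offsets in clockwise order, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.int64)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Each bit records only whether the neighbour is at least as bright
        # as the centre, so absolute intensity never enters the code.
        codes += (neighbour >= center) * (1 << bit)
    return codes

# Assumed illumination model: every pixel is scaled and shifted identically.
rng = np.random.default_rng(0)
patch = rng.integers(0, 100, size=(16, 16))
brighter = 2 * patch + 30

same_codes = np.array_equal(lbp_3x3(patch), lbp_3x3(brighter))
print("LBP codes unchanged under gain/offset:", same_codes)
```

Because each LBP bit depends only on the ordering of a neighbour and its centre pixel, any strictly increasing intensity mapping leaves the codes unchanged, whereas the raw pixel values differ everywhere.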
The MIL tracker with Haar-like features, however, is sensitive to IV because Haar-like features capture pixel intensity variations by subtracting pixel intensities between adjacent rectangular regions [21]. For occlusion (OCC), AdaBoost performs best because it allows online switching among multiple features at every frame [19]. KCF shows diminished performance because the FFT requires the filter and the search region to be of equal size, which limits the detection range [17]. Reduced performance is also observed for GOTURN since it estimates the object's location with one forward pass [20]. For motion blur (MB), MOSSE performs better because the correlation between the filter and the image becomes an element-wise multiplication in the Fourier domain [16]. The MEDIANFLOW tracker does not perform well under MB because rapid, unpredictable motion causes a large discrepancy between the forward and backward tracking trajectories [22]. Out-of-view (OV) resembles occlusion in some respects. MOSSE performs better under OV because it can detect occlusion via the Peak-to-Sidelobe Ratio (PSR) and reinitiate tracking if the object reappears [16]. The TLD tracker also performs better because its fail-safe detector re-detects the object upon tracking failure [23]. KCF's performance is degraded by the lack of a failure recovery mechanism [17]. In terms of background clutter (BC), CSR-DCF copes well because of its spatial reliability map [18]. For low resolution (LR), KCF performs poorly because its initial circulant matrices do not adapt to resolution variations [17]. In a nutshell, CSR-DCF is the preferred choice for more challenging conditions, while AdaBoost usually performs better under less complicated conditions.
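The two MOSSE properties mentioned above, correlation as an element-wise product in the Fourier domain and occlusion detection via the PSR, can be sketched as follows. This is an illustrative sketch, not the tracker evaluated in the study: the function names are hypothetical, the filter is a plain template rather than a trained MOSSE filter, and the 11 x 11 exclusion window around the peak (`exclude=5`) is an assumption taken from the common description of PSR.

```python
import numpy as np

def correlate_fft(image, filt):
    """Circular cross-correlation computed as an element-wise spectrum product."""
    F = np.fft.fft2(image)
    # Zero-pad the filter to the image size so both spectra have equal shape.
    H = np.fft.fft2(filt, s=image.shape)
    return np.real(np.fft.ifft2(F * np.conj(H)))

def peak_to_sidelobe_ratio(response, exclude=5):
    """PSR = (peak - sidelobe mean) / sidelobe std.

    The sidelobe is every pixel outside a (2*exclude+1)-wide window
    centred on the peak (assumed window size, see lead-in).
    """
    py, px = np.unravel_index(np.argmax(response), response.shape)
    mask = np.ones(response.shape, dtype=bool)
    y0, x0 = max(py - exclude, 0), max(px - exclude, 0)
    mask[y0:py + exclude + 1, x0:px + exclude + 1] = False
    sidelobe = response[mask]
    return (response[py, px] - sidelobe.mean()) / (sidelobe.std() + 1e-12)

# Synthetic scene: a random template embedded in weak noise at row 20, column 30.
rng = np.random.default_rng(0)
template = rng.normal(size=(16, 16))
scene = 0.1 * rng.normal(size=(64, 64))
scene[20:36, 30:46] += template

response = correlate_fft(scene, template)
peak = np.unravel_index(np.argmax(response), response.shape)
print("peak location:", peak)
print("PSR with target visible:", peak_to_sidelobe_ratio(response))

# With the target absent (noise only), the response has no dominant peak.
empty = correlate_fft(0.1 * rng.normal(size=(64, 64)), template)
print("PSR with target absent:", peak_to_sidelobe_ratio(empty))
```

A tracker could treat a PSR that falls below some threshold as a cue that the object is occluded or out of view, and resume tracking once the PSR recovers. The threshold value is not given in the text and would need to be chosen empirically.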