Lightweight Conv-Swin Transformer for Wildlife Detection
Guobin Yang, Chenhong Sui, Fuhao Jiang, Yunhao Pan, Ankang Zang, Jian Hu
2022 International Conference on Automation, Robotics and Computer Engineering (ICARCE), 2022-12-16
DOI: 10.1109/ICARCE55724.2022.10046623 (https://doi.org/10.1109/ICARCE55724.2022.10046623)
Citations: 1
Abstract
Wildlife detection is of great significance for wildlife monitoring and protection. Among existing object detection methods, Faster RCNN is a typical two-stage detector. Despite its effectiveness, its detection accuracy is less than satisfactory, mainly because it lacks a sufficient global representation of both objects and scenes. To this end, this paper proposes a lightweight Conv-Swin Transformer method for wildlife detection that combines convolution with the Swin Transformer. The lightweight improvements are made in two main ways. The first reduces the number of blocks in the third stage of the Swin Transformer; the second optimizes the down-sampling at the different stages of the Swin Transformer network by means of a convolutional structure, which speeds up detection and improves detection efficiency. The Faster RCNN model was chosen for experiments on a self-constructed wildlife dataset, with three different CNNs and the Swin Transformer used as backbone networks for comparison. Experimental results show that the improved Conv-Swin Transformer, which combines the advantages of the attention mechanism and the convolutional structure, improves detection speed by 17.5% with a slight reduction in detection accuracy.
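The effect of the first change, using fewer blocks in the third stage, can be illustrated with a rough parameter count. The sketch below is not the authors' implementation: it assumes Swin-T-like settings (base width 96, stage depths 2/2/6/2, heads 3/6/12/24, window size 7), and the reduced depth of 2 for the third stage is a hypothetical value, since the abstract does not state how many blocks were kept.

```python
# Rough count of the parameters held in the Swin blocks of a backbone.
# All settings are assumptions for illustration (Swin-T-like); the abstract
# does not give the variant used or the reduced third-stage depth.

def swin_block_params(dim, num_heads, window=7, mlp_ratio=4):
    """Learnable parameters in one Swin Transformer block of width `dim`."""
    qkv = dim * 3 * dim + 3 * dim                  # fused query/key/value projection
    proj = dim * dim + dim                         # attention output projection
    rel_bias = (2 * window - 1) ** 2 * num_heads   # relative position bias table
    hidden = mlp_ratio * dim
    mlp = dim * hidden + hidden + hidden * dim + dim  # two-layer MLP with biases
    norms = 2 * (2 * dim)                          # two LayerNorms (weight and bias)
    return qkv + proj + rel_bias + mlp + norms


def backbone_block_params(depths, base_dim=96, heads=(3, 6, 12, 24)):
    """Sum block parameters over all stages; width doubles at each stage."""
    total = 0
    for stage, (depth, num_heads) in enumerate(zip(depths, heads)):
        dim = base_dim * 2 ** stage
        total += depth * swin_block_params(dim, num_heads)
    return total


baseline = backbone_block_params((2, 2, 6, 2))  # Swin-T-like depths (assumed baseline)
reduced = backbone_block_params((2, 2, 2, 2))   # hypothetical lighter third stage

if __name__ == "__main__":
    saved = baseline - reduced
    print(f"baseline block parameters: {baseline:,}")
    print(f"reduced block parameters:  {reduced:,}")
    print(f"saved: {saved:,} ({saved / baseline:.1%})")
```

Under these assumed settings the third stage holds most of the blocks, so trimming it removes a large share of the backbone's block parameters. A parameter count only suggests where the savings come from; it does not by itself predict the measured change in detection speed.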