Infrared pedestrian tracking network based on convolution model and transformer model fusion
Authors: Zhang Guiqiang, Wang X. Yi, She X. Xing
Published in: 3rd International Conference on Applied Mathematics, Modelling, and Intelligent Computing (CAMMIC 2023), 2023-07-28
DOI: 10.1117/12.2686716 (https://doi.org/10.1117/12.2686716)
Abstract
Camera surveillance plays an important role in maintaining public safety and stability, and smart-city construction places further demands on it. This paper proposes a convolutional neural network that combines a convolution module with a Transformer module. The network is applied to tracking pedestrian targets in infrared surveillance footage, compensating for the shortcomings of surveillance cameras at night. The local features from the convolution module and the global features from the Transformer are fused into a comprehensive feature map, which addresses the scarcity of target feature information in infrared images, and an encoder-decoder (codec) network structure is used to preserve effective target features. To keep the model suitable for embedded deployment and portable, the convolution module uses grouped shared convolution kernels and the Transformer module uses nested segmentation, which makes the network lightweight. Across several sets of controlled experiments, the network improves both tracking speed and tracking performance, and it effectively addresses the difficulty of tracking weak, small infrared targets.
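
The abstract does not give the layer configuration, so the following is only an illustrative sketch of why grouping and kernel sharing reduce model size. It counts the weights of a k x k 2D convolution in three variants. The function name `conv_params`, the channel sizes, and in particular the reading of "shared" as "all groups reuse one kernel bank" are assumptions made for illustration, not details taken from the paper.

```python
def conv_params(c_in: int, c_out: int, k: int,
                groups: int = 1, shared: bool = False) -> int:
    """Weight count (bias excluded) of a k x k 2D convolution.

    groups > 1: grouped convolution; each group maps c_in/groups input
                channels to c_out/groups output channels.
    shared:     ASSUMPTION for illustration only: every group reuses the
                same kernel bank, so only one group's weights are stored.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("c_in and c_out must be divisible by groups")
    per_group = (c_in // groups) * (c_out // groups) * k * k
    return per_group if shared else per_group * groups


# Hypothetical layer sizes, chosen only to make the comparison concrete.
standard = conv_params(64, 64, 3)                          # 64 * 64 * 9
grouped = conv_params(64, 64, 3, groups=4)                 # 4 * (16 * 16 * 9)
grouped_shared = conv_params(64, 64, 3, groups=4, shared=True)  # 16 * 16 * 9

print(standard, grouped, grouped_shared)
```

Under these assumptions, splitting into g groups divides the weight count by g, and sharing one kernel bank across the groups divides it by g again, which is consistent with the lightweight goal stated in the abstract.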