Lightweight and Scale Adaptive Efficient backbone Network for Recognition
Authors: Chao Wang, Kaijie Zhang, Xiaoyong Yu, Xianpeng Xiong, Aihua Zheng
Journal: Sylwan (JCR category: Forestry, Q4; impact factor 0.5)
Published: 2023-01-01 (journal article)
DOI: 10.59879/ny20e (https://doi.org/10.59879/ny20e)
Fine-grained vehicle recognition is widely used in mine monitoring and management systems, road traffic command and control, and related applications. As target recognition systems based on deep learning are developed and deployed, designing a high-performing recognition algorithm has become a research priority in vehicle monitoring. In this paper, we propose an EfficientNet-based method for recognizing vehicle fronts and rears that addresses the shortcomings of current methods for this task, and we verify its reliability experimentally. The algorithm systematically investigates model scaling. Its backbone network makes extensive use of the MBConv structure to extract feature maps, which shortens model training time, and the structure includes a squeeze-and-excitation (SE) module that applies global average pooling for each channel to enhance model performance, so the network combines a small model size with high recognition accuracy. Building on this, we improve the inverted residual module of the EfficientNet backbone by introducing the coordinate attention (CA) mechanism, which averages spatial feature information along the X-axis and Y-axis separately while leaving the feature map size and number of channels unchanged. We also modify the residual edge so that it shortcuts the input and output of the high-dimensional channels, which improves the accuracy of feature extraction.
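The difference between the SE module's global average pooling and the directional averaging used by coordinate attention can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`se_pool`, `ca_pool`, `ca_reweight`) are hypothetical, the feature map is a plain nested list indexed as channel, row, column, and the learned convolutions of a real CA block are omitted, so only the pooling and a simplified sigmoid gating are shown.

```python
import math


def se_pool(x):
    """SE squeeze step: global average pooling, one scalar per channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]


def ca_pool(x):
    """Directional average pooling, as used by coordinate attention.

    pool_h[c][i] averages row i of channel c over the width (X axis).
    pool_w[c][j] averages column j of channel c over the height (Y axis).
    """
    pool_h = [[sum(row) / len(row) for row in ch] for ch in x]
    pool_w = [[sum(col) / len(col) for col in zip(*ch)] for ch in x]
    return pool_h, pool_w


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def ca_reweight(x):
    """Simplified gating (assumption: learned 1x1 convolutions are left out).

    Each value is scaled by a row gate and a column gate, so the output has
    the same number of channels and the same spatial size as the input.
    """
    pool_h, pool_w = ca_pool(x)
    return [
        [
            [v * sigmoid(pool_h[c][i]) * sigmoid(pool_w[c][j])
             for j, v in enumerate(row)]
            for i, row in enumerate(ch)
        ]
        for c, ch in enumerate(x)
    ]


# Toy feature map (illustrative values): 1 channel, height 2, width 3.
x = [[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]]
channel_means = se_pool(x)       # [3.5]
pool_h, pool_w = ca_pool(x)      # [[2.0, 5.0]], [[2.5, 3.5, 4.5]]
y = ca_reweight(x)
```

SE pooling collapses each channel to a single number, whereas the directional pooling keeps one value per row and one per column, so positional information along each axis survives. The reweighted output `y` has the same channel count and spatial size as `x`, consistent with the statement that the feature map size and number of channels are unchanged.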
Meanwhile, we introduce depthwise separable convolution and agent-normalized activation into the mobile inverted bottleneck convolution (MBConv) module to offset the two main sources of non-normalization that arise between convolutional layers along the X-axis and Y-axis dimensions, thereby improving the target detection rate and accuracy.
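To illustrate why depthwise separable convolution suits a lightweight backbone, the sketch below compares weight counts for a standard convolution and its depthwise separable counterpart. The layer sizes (3x3 kernel, 32 input channels, 64 output channels) and the function names are assumptions chosen for illustration and do not come from the paper; bias terms are ignored.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    """Weights in a depthwise separable convolution (bias ignored).

    Depthwise step: one k x k filter per input channel.
    Pointwise step: a 1 x 1 convolution that mixes channels.
    """
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


# Illustrative layer sizes (assumed, not taken from the paper).
k, c_in, c_out = 3, 32, 64
std = standard_conv_params(k, c_in, c_out)    # 18432
sep = separable_conv_params(k, c_in, c_out)   # 2336
ratio = round(std / sep, 2)                   # 7.89
print(std, sep, ratio)
```

For these assumed sizes the separable form needs roughly one eighth of the weights, which is the kind of saving that keeps the model small.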
About the journal:
SYLWAN is the oldest forestry science journal in Poland and one of the first in the world. It was founded in 1820 in Warsaw. It has contributed greatly to the development of Polish forestry, serving progress, the dissemination of forestry knowledge, and the advancement of science.