Weiyue Bao, Hong Zhang, Yaoyao Ding, Fangzhou Shen, Liujun Li
Title: EdgeNet: a low-power image recognition model based on small sample information
Journal: Pattern Analysis and Applications (impact factor 3.7; JCR Q2, Computer Science, Artificial Intelligence)
Published: 2024-07-08
DOI: https://doi.org/10.1007/s10044-024-01289-6
Citations: 0
Abstract
Deep convolutional neural networks that rely on large datasets typically require high-resolution images and deep models, for both training and inference, to achieve high accuracy in image recognition and classification. Low-power devices need lightweight models instead. However, lightweight models are limited in their ability to classify and recognize small, low-resolution images: their parameter count is constrained, and they cannot perform deep feature extraction, because a low-resolution image carries only small sample information. In intelligent interaction in digital media, capturing, storing, transmitting, and computing on high-resolution, high-precision images incurs high power consumption and operating costs. When an image recognition system is deployed on the client side of IoT devices, it is difficult to meet the hardware requirements of large storage space and fast computation. It is also challenging to use high-resolution image data directly for model training and fine-tuning, and the model's size and parameter updates are limited by the storage and computing capacity of the hardware. We propose a low-power image recognition framework that consists of a data pre-processing part and a lightweight model architecture part. The pre-processing method is based on an auto-encoder that filters the R, G, and B color channel data with a resolution filter to compress the data, that is, to downscale large inputs to a smaller size, which addresses the limitations on deploying and training deep learning models on low-power devices. Building on the resolution filter, a channel normalization method is proposed that performs batch normalization on each channel dimension, so that the original image data is encoded at the same size and the mean squared error discrimination of the image data improves. The lightweight model, EdgeNet, uses a depthwise separable convolutional neural network and two kinds of blocks, one with batch normalization and one without. This architecture makes the model better suited to deployment on IoT devices. The proposed framework incurs only a small, acceptable loss of precision while improving the model's forward inference speed and reducing memory storage to 8.7 MB.
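The abstract gives no implementation details, so the following is only a minimal sketch of the three ideas it names, not the authors' method. It assumes that the resolution filter can be approximated by plain block averaging (the paper uses a learned auto-encoder), that channel normalization means zero-mean, unit-variance scaling of each color channel over the batch, and that weight counts ignore biases. All function names and the example sizes are hypothetical.

```python
import numpy as np


def downscale_channels(batch: np.ndarray, factor: int) -> np.ndarray:
    """Shrink each color channel by averaging non-overlapping factor x factor blocks.

    batch has shape (N, H, W, C). Block averaging is an assumed stand-in
    for the paper's learned resolution filter.
    """
    n, h, w, c = batch.shape
    if h % factor or w % factor:
        raise ValueError("height and width must be divisible by factor")
    # Split H and W into (blocks, within-block) axes, then average within blocks.
    blocks = batch.reshape(n, h // factor, factor, w // factor, factor, c)
    return blocks.mean(axis=(2, 4))


def normalize_channels(batch: np.ndarray, eps: float = 1e-5) -> np.ndarray:
    """Scale each channel to zero mean and unit variance over the whole batch."""
    mean = batch.mean(axis=(0, 1, 2), keepdims=True)
    var = batch.var(axis=(0, 1, 2), keepdims=True)
    return (batch - mean) / np.sqrt(var + eps)


def conv_weight_counts(k: int, c_in: int, c_out: int) -> tuple[int, int]:
    """Weights (no biases) of a standard k x k conv vs. a depthwise separable one."""
    standard = k * k * c_in * c_out
    # Depthwise k x k filter per input channel, then a 1 x 1 pointwise conv.
    separable = k * k * c_in + c_in * c_out
    return standard, separable


rng = np.random.default_rng(0)
images = rng.uniform(0.0, 255.0, size=(8, 64, 64, 3))  # hypothetical RGB batch

small = downscale_channels(images, factor=4)
encoded = normalize_channels(small)

print(small.shape)                    # (8, 16, 16, 3)
print(conv_weight_counts(3, 32, 64))  # (18432, 2336)
```

In this illustrative setting the 4x downscaling cuts the pixel count per image by a factor of 16, and the depthwise separable layer needs roughly one eighth of the weights of the standard convolution, which is the kind of saving that makes a model in the megabyte range plausible on IoT hardware.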
About the journal:
The journal publishes high quality articles in areas of fundamental research in intelligent pattern analysis and applications in computer science and engineering. It aims to provide a forum for original research which describes novel pattern analysis techniques and industrial applications of the current technology. In addition, the journal will also publish articles on pattern analysis applications in medical imaging. The journal solicits articles that detail new technology and methods for pattern recognition and analysis in applied domains including, but not limited to, computer vision and image processing, speech analysis, robotics, multimedia, document analysis, character recognition, knowledge engineering for pattern recognition, fractal analysis, and intelligent control. The journal publishes articles on the use of advanced pattern recognition and analysis methods including statistical techniques, neural networks, genetic algorithms, fuzzy pattern recognition, machine learning, and hardware implementations which are either relevant to the development of pattern analysis as a research area or detail novel pattern analysis applications. Papers proposing new classifier systems or their development, pattern analysis systems for real-time applications, fuzzy and temporal pattern recognition and uncertainty management in applied pattern recognition are particularly solicited.