{"title":"A Novel Cross Grouping CG MLP based on local mechanism","authors":"Hang Xu, Tao Wang, Wei Wen, Xingyu Liu","doi":"10.1109/CMVIT57620.2023.00021","DOIUrl":"https://doi.org/10.1109/CMVIT57620.2023.00021","url":null,"abstract":"Recently, Google proposed the MLP-Mixer – a simple multi-layer fully connected network, proving that convolutional and attention mechanisms are not irreplaceable. Although MLP-Mixer is simple, training it requires a lot of resources. In this paper, a network model——Cross Grouping MLP(CG MLP) based on local mechanism is proposed. The CG MLP module is a general visual task backbone that replaces the original MLP’s spatial mixing module. CG MLP introduces vertical and horizontal bar grouping in different channels of feature map to extract local information. CG MLP also introduces pyramid structure. For the input image, this model reduces the computational complexity of MLP from the square of the area(the fourth power of the side length) to the third power of the side length. CG MLP with 64M parameters achieved 82.5% accuracy on Imagenet-1K, and it reaches the SOTA performance of MLP models.","PeriodicalId":191655,"journal":{"name":"2023 7th International Conference on Machine Vision and Information Technology (CMVIT)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127320524","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improved Root Sparse Bayesian Learning for DOA Estimation in Non-uniform Noise","authors":"Yifan Zhang, Hangfang Zhao","doi":"10.1109/CMVIT57620.2023.00017","DOIUrl":"https://doi.org/10.1109/CMVIT57620.2023.00017","url":null,"abstract":"The vigorous development of sparse signal reconstruction (SSR) technology provides a new idea for realizing direction-of-arrival (DOA) estimation. This paper proposes an improved root sparse Bayesian learning algorithm to solve the problem of poor estimation accuracy of traditional DOA estimation algorithms based on SSR technology under off-grid error and non-uniform noise. The improved algorithm not only achieves accurate estimation of the non-uniform noise through a small number of iterations but also uses the expectation-maximization (EM) algorithm to iteratively refine the discrete sampling grid, which shows that the calculation of updating the grid points can be realized by the root of a particular polynomial. The simulation proves that the algorithm has excellent estimation performance under the coarse grid and non-uniform noise.","PeriodicalId":191655,"journal":{"name":"2023 7th International Conference on Machine Vision and Information Technology (CMVIT)","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122351430","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Review of the Application of Machine Vision in Aquaculture UAVs","authors":"Yang Zhiling","doi":"10.1109/cmvit57620.2023.00022","DOIUrl":"https://doi.org/10.1109/cmvit57620.2023.00022","url":null,"abstract":"Aquaculture is difficult to be monitored in real-time due to the influence of vast water area and complex environment. Machine vision uses the machine to replace the human eye to convert the target into image signals and send them to the special image processing system, so as to get the morphological information of the target. Combining machine vision and UAV technology could revolutionize aquaculture. The real-time image processing device installed on the UAV can quickly and accurately identify fish schools or monitor water conditions. This paper explores the application of machine vision in aquaculture UAVs, which is of great significance for the following research.","PeriodicalId":191655,"journal":{"name":"2023 7th International Conference on Machine Vision and Information Technology (CMVIT)","volume":"85 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126283805","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improved Convolutional 3D Networks for Micro-Movements Recognition","authors":"Rui Yuan, Lihua Zhang","doi":"10.1109/cmvit57620.2023.00026","DOIUrl":"https://doi.org/10.1109/cmvit57620.2023.00026","url":null,"abstract":"It is of great significance for computers to recognize the actions in videos. The human body’s action recognition has been applied in many fields. The majority of action recognition methods have relatively low precision in recognizing micro-movements. In some specific scenarios, tasks such as intelligent home companionship for the elderly and early warning for dangerous driving behaviors, the micro-actions of the observed are extremely important in the recognition task. At the same time, due to the physiological characteristics of the elderly or the limitation of the environment, the amplitude of the actions is relatively small. This research suggests an action recognition method based on deep learning to better analyze micro-movements-oriented action recognition. Inspired by transformer, we split an image into fixed-size patches. The network structure of C3D is improved. The idea of image patch is introduced to reduce the receptive field of each region in the video frame. Finally, the experimental verification is performed on two action recognition datasets, UCF101 and NTU. The average accuracies on UCF101 and NTU respectively are 91.74% and 88.01%, which show that the proposed algorithm can effectively improve the recognition ability of micro-movements and obtain better results compared with other baselines.","PeriodicalId":191655,"journal":{"name":"2023 7th International Conference on Machine Vision and Information Technology (CMVIT)","volume":"84 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130277436","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A new method for producing temperature profiles based on ERA5 and RAOB","authors":"Yale Qiao","doi":"10.1109/CMVIT57620.2023.00016","DOIUrl":"https://doi.org/10.1109/CMVIT57620.2023.00016","url":null,"abstract":"Temperature profiles are important meteorological parameters of the atmosphere that can determine atmospheric thermal processes. Detecting global spatial and temporal continuous atmospheric temperature profiles is crucial for weather protection work. Atmospheric datasets such as ERA5 (fifth generation ECMWF reanalysis) provide global and continuous temperature profile datasets with good resolution. RAOB (radiosonde) sounding data have high confidence and representativeness and are commonly used for data accuracy validation. In this paper, we use the RAOB sounding data of 2017 as the true value and revise the ERA5 reanalysis data based on machine learning methods to optimize the data. The algorithm not only improves the problem of RAOB distribution discontinuity but also improves the accuracy of ERA5 itself. In order to verify the results of the algorithm, the RAOB sounding data are compared with it, and it is found that the accuracy of the revised data is reduced by about 3K compared to the preprocessing RMSE, which is closer to the RAOB data. The algorithm proposed in this paper can provide important data support for subsequent meteorological studies.","PeriodicalId":191655,"journal":{"name":"2023 7th International Conference on Machine Vision and Information Technology (CMVIT)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133279730","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fast Estimation of Direction of Arrival for Towed Array Based on Sparse Bayesian Learning","authors":"Zican Zhang, Xiang Pan","doi":"10.1109/CMVIT57620.2023.00014","DOIUrl":"https://doi.org/10.1109/CMVIT57620.2023.00014","url":null,"abstract":"In order to solve the problem of slow convergence of the direction of arrival (DOA) estimation algorithm based on sparse Bayesian learning (SBL), a fast converging SBL(FCSBL) of DOA estimation algorithm is obtained by introducing an approximate posterior covariance in hyperparameter iteration. During maneuvering turns, the towed array is modeled as a parabolic array to correct the distortion of array shape. Taking the bow of the array as a hyperparameter for SBL, this paper proposes a fast converging adaptive bow sparse Bayesian learning algorithm, to jointly estimate array shape and DOAs from acoustic data. Numerical simulation and MAPEX2000 experimental data processing results show that FC-ABSBL performs well in detection of weak targets and estimation of the array bow during maneuvering turns with low computational load.","PeriodicalId":191655,"journal":{"name":"2023 7th International Conference on Machine Vision and Information Technology (CMVIT)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124839515","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The influence and remodeling of artificial intelligence technology to China’s news dissemination industry : ——Taking the application of Baidu Brain AI core technology engine as an example","authors":"Boxiong Song","doi":"10.1109/cmvit57620.2023.00037","DOIUrl":"https://doi.org/10.1109/cmvit57620.2023.00037","url":null,"abstract":"Register the software, enter the keywords, adjust the parameters, and then you can wait for the work to be produced. In just a few minutes, a beautifully rendered image is presented to the user. If the user is satisfied with it, they can take the “painting” to sell or enter a competition or even win a gold medal in an international competition. Faced with such AI technology, some people are amazed, some people are angry, some people are upset, and some people are cheering. Many people call this a “game” between machine and human. What should we think about painting art from AI? This paper will discuss AI painting as the theme, analyze the basic principles and processes of AI painting, analyze the impact of AI painting on our production and life, and put forward some suggestions and thoughts for its development in the future.","PeriodicalId":191655,"journal":{"name":"2023 7th International Conference on Machine Vision and Information Technology (CMVIT)","volume":"93 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129858713","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Image Dehazing based on Multi-scale Feature Fusion under Attention Mechanism","authors":"Shaotian Wang, Guihui Chen","doi":"10.1109/CMVIT57620.2023.00024","DOIUrl":"https://doi.org/10.1109/CMVIT57620.2023.00024","url":null,"abstract":"To solve the problems of insufficient feature extraction and the loss of too much image information in existing methods, a dehazing network based on multi-scale feature fusion under attention mechanism is proposed. Firstly, the base convolutional layer in U-Net is built using improved fully connected residual blocks to reduce the amount of computation. Secondly, the self-convolution block based on the self-attention mechanism is added to extract more delicate feature information of the image. Finally, to increase feature reuse and reduce feature information loss, the feature maps of different levels are fused using various scale gated units. In order to improve the capacity of the restored image to be recognized subjectively, the mixed loss function of multi-scale structural similarity and minimal absolute error is introduced. Experiments are carried out with synthetic haze data sets. Compared with other neural networks, the multi-scale structural similarity and peak signal-to-noise of the dehazed image of the proposed network are increased by 4.31% and 18.33% on average, respectively. The experiment results demonstrate that the network can efficiently avoid color distortion, halo and strong edge effect around the object, and the image has high subjective recognition after haze removal.","PeriodicalId":191655,"journal":{"name":"2023 7th International Conference on Machine Vision and Information Technology (CMVIT)","volume":"100 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114010558","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Research on Target Detection of Regional Monitoring with Complex Background using CNN and Background Modelling","authors":"Weichen Sun, ZhanHua Yang, Bo Zhao, Y. Wang, Zhonglin Yang, Yutong Jiang, Haiping Song","doi":"10.1109/CMVIT57620.2023.00028","DOIUrl":"https://doi.org/10.1109/CMVIT57620.2023.00028","url":null,"abstract":"The regional monitoring systems aim to recognize and localize the target of interest in the region area. However, the target detection algorithm currently used in the regional monitoring system has problems such as low recognition probability under complex background conditions. The type of moving object recognized by the background modelling algorithm is difficult to judge. This paper proposes a monitoring area target detection method that fuses the detection results of the two YOLOv5 target detection algorithms and the Vibe background modelling method through Kalman filtering. Experiments show that the proposed method can improve the consistency and stability of target detection results in regional monitoring scenarios.","PeriodicalId":191655,"journal":{"name":"2023 7th International Conference on Machine Vision and Information Technology (CMVIT)","volume":"306 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132627792","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"High-Automatical and High-Accurate Pupil Location Neural Network via FRST FPL","authors":"Hong‐lei Ma, Ran Shen, Jing Ye, Huajun Su, Hantian Xie, Han Jiang","doi":"10.1109/CMVIT57620.2023.00018","DOIUrl":"https://doi.org/10.1109/CMVIT57620.2023.00018","url":null,"abstract":"Pupil location refers to the location of the pupil or its center in an image. To solve the problem that the pupil location method is difficult to achieve high automation and high accuracy at the same time, this paper proposes a method combining image processing and statistical learning. In this paper, an improved algorithm of the fast radial symmetry transform (FRST) based on pupil location is proposed, namely FRSTFPL (fast radial symmetry transform for pupil location), which is used to coarsely localize the pupil in the image, followed by a shallow CNN to achieve precise localization. In addition, we construct a dataset based on the CASIA-IrisV4 iris image database and then conduct a variety of experiments. The results show that the location error of the proposed method in an image with a size of 640 × 480 pixels is 8.51 pixels, which exceeds the performance of the comparing methods. In our method, not only accurate radius and complex network are unnecessary, but also highly automated, low computational complexity, and relatively high localizing accuracy can be achieved together.","PeriodicalId":191655,"journal":{"name":"2023 7th International Conference on Machine Vision and Information Technology (CMVIT)","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121284090","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}