Yewei Shi, Xiao Yao, Ruixuan Chen, Lili Yuan, Ning Xu, Xiaofeng Liu
{"title":"Image recognition based on multi-scale dilated lightweight network model","authors":"Yewei Shi, Xiao Yao, Ruixuan Chen, Lili Yuan, Ning Xu, Xiaofeng Liu","doi":"10.1145/3381271.3381300","DOIUrl":"https://doi.org/10.1145/3381271.3381300","url":null,"abstract":"Lightweight model is mainly applied to maintain performance and reduce the amount of parameters, simplifying the complex laboratory model to the mobile embedded device. We present a multi-scale dilated lightweight network model for image recognition. ShuffleNet is an classical lightweight neural network that proposes channel shuffle to help exchange information between groups during group convolution. However, ShuffleNet does not make full use of each group of information after channel shuffle. Since channel shuffle guarantees that each group contains the information of other groups, in this paper, we propose to process the grouping data with different dilated convolution, and obtain the multi-scale information of different receptive fields without increasing parameters. At the same time, we make an improvement on the network model to reduce the gridding artifacts caused by dilated convolution. Experiments on CIFAR-10 and EMNIST show that the improved algorithm performs better than traditional method.","PeriodicalId":124651,"journal":{"name":"Proceedings of the 5th International Conference on Multimedia and Image Processing","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114956822","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Defect detection in ID cards with accurately reconstructed reference image","authors":"Xue Chen, Jianwen Cao, Yu-Peng Wang","doi":"10.1145/3381271.3381298","DOIUrl":"https://doi.org/10.1145/3381271.3381298","url":null,"abstract":"ID card is made by hot-pressing a standard film with identifiable information onto a fixed baseboard with background of wavy lines. In this paper, we propose a defect detect algorithm by synthesising the film image and baseboard image to accurately reconstruct a reference image for a test card. First, to ensure the content consistency in position and scale, we align the card to a standard film image through perspective transformation(PT) based on AKAZE key-points. Besides, we use contrast limited adaptive histogram equalization(CLAHE) to enhance the background pattern of a baseboard image, and then align it to the rectified card. Second, we apply multiply algorithm to synthesise the aligned film image and baseboard image as a reconstructed reference image. Besides, we align the lightness histogram of the reference image to a test card so as to eliminate the lighting difference. Finally, we apply the difference image method based on canny edge detection to detect difference between a reference image with a card, and further extract the defect information. We experiment on cards with different types of defects and shooting disturbances. Results show high accuracy of our method.","PeriodicalId":124651,"journal":{"name":"Proceedings of the 5th International Conference on Multimedia and Image Processing","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122867154","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Video dehazing based on CNN","authors":"Xing Zhao, Ting Zhang, Xiang Zhan, Wenxin Chen","doi":"10.1145/3381271.3381278","DOIUrl":"https://doi.org/10.1145/3381271.3381278","url":null,"abstract":"The appearance of outdoor images is easily affected by natural phenomena such as fog and dust, which reduces contrast and color distortion. Video dehazing has a wide range of real-time applications, but the challenges mainly come from large amount of computation and bad real-time performance. In this paper, we propose a video dehazing system which is an end-to-end network based on CNN (Convolutional Neural Network). The dehazing algorithm learns the scene transmission and the global atmospheric light simultaneously, which simplifies the dehaze process and improves the real-time performance. Finally, we process videos through combining the end-to-end dehaze network and bicubic interpolation algorithm, and obtain satisfactory results. The experiment results demonstrate that the proposed method performs favorably against the state-of-the-art methods on both quantitative and qualitative evaluation.","PeriodicalId":124651,"journal":{"name":"Proceedings of the 5th International Conference on Multimedia and Image Processing","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114406806","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The design of an intelligent monitoring system for human hand behaviors","authors":"Zhengliang Wu, Mingfeng Lu, Chenchen Ji","doi":"10.1145/3381271.3381290","DOIUrl":"https://doi.org/10.1145/3381271.3381290","url":null,"abstract":"Intelligent monitoring to observe human behaviors and report anomaly activities is a common application of computer vision technologies. However, to the authors knowledge, there has not been a widely accepted structure of building such a system with the fast-developing deep learning method. Within the author's knowledge, current works focus on industrial and traffic conditions, such as measuring the speeds of vehicles in highways, or to help arrange agriculture productions. This paper presents an efficient approach to applying deep learning techniques in such systems to help analyze human behaviors. Specifically, we combined the advancing object detection, pose estimation and image classification methods in our work to recognize some anomaly or special behaviors in determined scenes.","PeriodicalId":124651,"journal":{"name":"Proceedings of the 5th International Conference on Multimedia and Image Processing","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129547008","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Information hiding in scanned binary image of chinese characters","authors":"Wen Wen","doi":"10.1145/3381271.3381301","DOIUrl":"https://doi.org/10.1145/3381271.3381301","url":null,"abstract":"Because most existing algorithms rarely consider the inherent characteristics of Chinese characters, information hiding in binary image of Chinese characters causes large distortion to the original binary image. To solve this problem, this paper proposes an algorithm for hiding and extracting information on binary image of Chinese characters. For information hiding, the scanned image is firstly geometrically corrected, then Chinese characters in the image are segmented by projection method. We take each segmented character as the unit for one bit of information hiding. The parity of number of black pixels in each character represents bit hidden, \"1\" or \"0\". Then, the position of hidden information is determined according to the stroke trend of Chinese characters. For information extraction, after segmenting characters in the same way as at information hiding phase, the receiver calculates the number of black pixels in each character to obtain information. Experimental results show that the proposed algorithm has a great advantage in the imperceptibility of information hiding. In addition, the computational cost of the algorithms both in hiding information and extracting information is low.","PeriodicalId":124651,"journal":{"name":"Proceedings of the 5th International Conference on Multimedia and Image Processing","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128708436","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The fault diagnosis of catenary system based on the deep learning method in the railway industry","authors":"Chenchen Huang, Yuan Zeng","doi":"10.1145/3381271.3381293","DOIUrl":"https://doi.org/10.1145/3381271.3381293","url":null,"abstract":"The catenary system plays a vital role in the railway industry, which is associated with the security and efficiency of the train operation. The fault diagnosis and anomaly detection of the catenary system is of significance. The current carrying ring and dropper are important parts of catenary and attract attention in the inspection process. Based on the image processing technique and deep learning method, the fault diagnosis method of the catenary system is presented. The fault diagnosis of catenary system consists of three parts, top current carrying ring, dropper and bottom current carrying ring detection. The feature pyramid network is applied for the various scales units of catenary system in image from inspection vehicle. Based on the modified CenterNet, the current carrying ring is detected. The results of the located rings are chosen through specific selection. Then the selected top and bottom rings are matched further through the location relationship. Based on the matched rings, the dropper is located and then classified by the classification network. According to the experiments on the plenty of catenary image datasets, it shows that the method have efficient and satisfied performance on the fault diagnosis of the catenary system.","PeriodicalId":124651,"journal":{"name":"Proceedings of the 5th International Conference on Multimedia and Image Processing","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126567969","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic bounding-box-labeling method of occluded objects in virtual image data","authors":"Xinyue Wang, LingZhong Meng, Yunzhi Xue","doi":"10.1145/3381271.3381292","DOIUrl":"https://doi.org/10.1145/3381271.3381292","url":null,"abstract":"Computer vision technology is widely used based on its massive and correct data set, of which the bounding box labeling is a common method. Aimed at a large number of original image data set produced by virtual simulation, we proposed an automatic pixel-level bounding-box-labeling method to solve problem of accuracy and speed. The method starts by a fundamental algorithm based on targeted bounding box, which will be adopted to label the images produced by virtual simulation and learn from the bounding box of different objects; Next, the method will find consistent seed points and apply region growing algorithm to automatically produce binary images based on the seed points; Then, an occlusion-estimating algorithm can be used to evaluate the occluded conditions in the binary image; Finally, employ bounding-box-labeling algorithm to label targeted objects according to various occlusion. Apply the data set from 2019 Small Target Competition held by China Society of Images and Graphics to test and verify our method, the result turns out that this method can solve the occlusion problem especially the truncate occlusion and can label the objects' entire body precisely.","PeriodicalId":124651,"journal":{"name":"Proceedings of the 5th International Conference on Multimedia and Image Processing","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131038038","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Pavement type recognition based on deep learning","authors":"Gaojian Cui, Fanghu Ning, Xiaoguang Ren","doi":"10.1145/3381271.3381286","DOIUrl":"https://doi.org/10.1145/3381271.3381286","url":null,"abstract":"To obtain pavement type information on the basis of related theoretical knowledge on deep learning, this study constructs a deep convolutional neural network model that uses video information on ice, snow, asphalt, and cement pavements collected by a vehicle camera. The data are preprocessed to obtain training and test sets. The training set is used for neural network training, and the accuracy of the model is evaluated with the test set. Results show that the constructed network model can accurately classify four kinds of pavements with an accuracy of 99.5%.","PeriodicalId":124651,"journal":{"name":"Proceedings of the 5th International Conference on Multimedia and Image Processing","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134557398","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Flying point target tracking using infrared images","authors":"S. Cao, Hongyan He","doi":"10.1145/3381271.3381284","DOIUrl":"https://doi.org/10.1145/3381271.3381284","url":null,"abstract":"To improve the detection level and performance of designed infrared motion analysis system, a combined scheme is proposed to get long-term tracklet for flying small objects: 1) point object is efficiently detected via a fast background extraction, and an improved correlation filtering algorithm is utilized for possibly near object with much texture; 2) tracker is initialized and managed by estimating continuous motion using Kalman filter; 3) prior knowledge of object is further incorporated to remove false object in tracklet association process. Outdoor experiment proves the proposed techniques improve the accuracy for target objects, and it also extends the validness of our strategy for coming on orbit system.","PeriodicalId":124651,"journal":{"name":"Proceedings of the 5th International Conference on Multimedia and Image Processing","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124928047","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Perception of gender-stereotype in films: a case study on \"Captain Marvel\" superhero movie","authors":"K. T. Chau, Pau Yen Ooi, Tania Amos","doi":"10.1145/3381271.3381303","DOIUrl":"https://doi.org/10.1145/3381271.3381303","url":null,"abstract":"This paper aims to examine the perception of Malaysians on the issues related to gender-stereotypes in superhero genre movies in general, and Captain Marvel movie in specific. The paper aims to call into insight as to whether or not Malaysians accept the female superhero and relevant message delivered. Do they perceive the movie as something positive, agree with the manner and behaviour portrayed, or biased against female superheroes? This research recruited 98 respondents and quantitative analysis reveals that majority of the respondents accepted the Captain Marvel, and responded positively to the feminine power depicted in the movie.","PeriodicalId":124651,"journal":{"name":"Proceedings of the 5th International Conference on Multimedia and Image Processing","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125918066","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}