{"title":"Classification of defaced occlusion plates based on convolutional neural network","authors":"Sen Zhang, Jinglei Zhang, Jie Li, Shuai Chen","doi":"10.1117/12.2574415","DOIUrl":"https://doi.org/10.1117/12.2574415","url":null,"abstract":"As one of the important components of intelligent transportation, license plate recognition plays an irreplaceable role in people's daily life. For example, illegal vehicles often escape from punishment because of the number plate defacement or intentional occlusion, which further increases the difficulty of law enforcement. Therefore, it is significant for automatic recognition system to improve the identification efficiency of the contaminated or occluded license plate. This paper mainly focuses on the recognition of occlusion number plate. License plates can be divided into four categories: normal number plate, partial occlusion number plate, complete occlusion number plate and unsuspended number plate. The traditional OCR algorithm has a high accuracy in the recognition of Chinese characters, characters and numbers. Although the detection of normal and partial occlusion plates also shows a good recognition in the case of OCR, the recognition of complete occlusion and unsuspended license plates is still very poor. With the development of artificial intelligence, it is possible to identify all the sheltered and unsuspended plates better. Combining with the advantages of traditional algorithms, this paper uses traditional OCR and current deep learning algorithm to optimize the recognition effect of stained license plate.","PeriodicalId":90079,"journal":{"name":"... International Workshop on Pattern Recognition in NeuroImaging. International Workshop on Pattern Recognition in NeuroImaging","volume":"24 1","pages":"1152605 - 1152605-7"},"PeriodicalIF":0.0,"publicationDate":"2020-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75182311","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Research on smart navigation system based on AR technology","authors":"Liyan Chen, Xiaojie Xie, Lu Lin, Beizhan Wang, Weiqiang Lin","doi":"10.1117/12.2574673","DOIUrl":"https://doi.org/10.1117/12.2574673","url":null,"abstract":"The system uses the Unity3D engine to develop the Android app, and develop AR technology modules with Vuforia toolkit, integrating geographic information service technology and panorama technology. We combined the two tracking and registration methods which are based on sensors and natural images features, to implement a tourism navigation and AR introduction system based on Kulangsu, a famous scenic spot in Xiamen. The system is mainly divided into two modules, the route navigation module and the scenic spots guide module.","PeriodicalId":90079,"journal":{"name":"... International Workshop on Pattern Recognition in NeuroImaging. International Workshop on Pattern Recognition in NeuroImaging","volume":"89 1","pages":"115260J - 115260J-7"},"PeriodicalIF":0.0,"publicationDate":"2020-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75199393","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A discernible criterion for 3D point cloud based on multifractal spectrum","authors":"Kun Yu, Jie Ma, Bin Fang, Bingli Wu","doi":"10.1117/12.2574409","DOIUrl":"https://doi.org/10.1117/12.2574409","url":null,"abstract":"The traditional discernible criteria for a 2D target are mostly based on Johnson criterion, to overcome the limitations of the Johnson criterion and fill the gap in a 3D point cloud, a novel discernible criterion has been proposed for the 3D point cloud. Based on the multifractal spectrum, the spatial distribution of the 3D point cloud is described. By analyzing the multifractal spectra at different resolutions, feature trend and the final discernible resolution are concluded. The experimental results show that the limiting resolution of T90, F15C is 585mm, the limiting resolution of T90 and Rexton is 517mm, and the limiting resolution of F15C and Rexton is 541mm. The proposed discernible criteria can provide theoretical support for limit identification resolution of 3D point cloud target.","PeriodicalId":90079,"journal":{"name":"... International Workshop on Pattern Recognition in NeuroImaging. International Workshop on Pattern Recognition in NeuroImaging","volume":"184 1","pages":"1152602 - 1152602-5"},"PeriodicalIF":0.0,"publicationDate":"2020-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88968120","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving visual question answering with pre-trained language modeling","authors":"Yue Wu, Huiyi Gao, Lei Chen","doi":"10.1117/12.2574575","DOIUrl":"https://doi.org/10.1117/12.2574575","url":null,"abstract":"Visual question answering is a task of significant importance for research in artificial intelligence. However, most studies often use simple gated recurrent units (GRU) to extract question or image high-level features, and it is not enough for achieving a better performance. In this paper, two improvements are proposed to a general VQA model based on the dynamic memory network (DMN). We initialize the question module of our model using the pre-trained language model. On the other hand, we utilize a new module to replace GRU in the input fusion layer of the input module. Experimental results demonstrate the effectiveness of our method with the improvement of 1.52% on the Visual Question Answering V2 dataset over baseline.","PeriodicalId":90079,"journal":{"name":"... International Workshop on Pattern Recognition in NeuroImaging. International Workshop on Pattern Recognition in NeuroImaging","volume":"18 1","pages":"115260D - 115260D-5"},"PeriodicalIF":0.0,"publicationDate":"2020-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85702964","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Human video synthesis using generative adversarial networks","authors":"Abdullah Azeem, Waqar Riaz, Abubakar Siddique, Tahir Junaid Saifullah","doi":"10.1117/12.2574615","DOIUrl":"https://doi.org/10.1117/12.2574615","url":null,"abstract":"In this work, a video synthesis model based on Generative Adversarial Networks (Human GAN) is proposed, whose objective is to generate a photorealistic output by learning the mapping function from an input source to output video. However, the image to image generation is a quite popular problem, but the video synthesis problem is still unexplored. Directly employing existing image generation method without taking temporal dynamics into account leads to frequent temporally incoherent output with low visual quality. The proposed approach solves this problem by wisely designing generators and discriminators combined with Spatio-temporal adversarial objects. While comparing it to some robust baselines on public benchmarks, the proposed model proves to be superior in generating temporally coherent videos with extremely low artifacts. And results achieved by the proposed model are more realistic on both quantitative and qualitative measures compared to other existing baselines techniques.","PeriodicalId":90079,"journal":{"name":"... International Workshop on Pattern Recognition in NeuroImaging. International Workshop on Pattern Recognition in NeuroImaging","volume":"415 14","pages":"115260I - 115260I-5"},"PeriodicalIF":0.0,"publicationDate":"2020-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72506658","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A novel graph matching method based on the local and global distance information of the graph nodes","authors":"Zhaoning Yin, Chunyu Zhao, Dongmei Niu, Xiuyang Zhao","doi":"10.1117/12.2574410","DOIUrl":"https://doi.org/10.1117/12.2574410","url":null,"abstract":"Graph matching is a classical NP-hard problem, and it plays an important role in many applications in computer science. In this paper, we propose an approximate graph matching method. For two graphs to be matched, our method first constructs an association graph with nodes representing the candidate correspondences between the two original graphs. It then constructs an affinity matrix based on the local and global distance information between the original graphs’ nodes. Each element of the matrix represents the mutual consistency of a pair of nodes of the association graph. After simulating random walks on the association graph, a stable quasi-stationary distribution is obtained. With the Hungarian algorithm, our method finally discretizes the distribution to achieve an approximate matching between the two original graphs. Experiments on two commonly used datasets demonstrate the effectiveness of our method on graph matching.","PeriodicalId":90079,"journal":{"name":"... International Workshop on Pattern Recognition in NeuroImaging. International Workshop on Pattern Recognition in NeuroImaging","volume":"33 1","pages":"1152603 - 1152603-5"},"PeriodicalIF":0.0,"publicationDate":"2020-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78614645","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A framework for multimodal sign language recognition under small sample based on key-frame sampling","authors":"Jianyu Wang, Jianxin Chen, Yi-Yu Cai","doi":"10.1117/12.2574424","DOIUrl":"https://doi.org/10.1117/12.2574424","url":null,"abstract":"Sign language recognition is challenging, due to the scarcity of available annotated corpora and the difficulty of large vocabulary. In this paper, we study the task based on a Chinese SL database-DEVISIGN, but it only has a few samples to train the deep network on the scratch. First, we segment the hand to eliminate the disturbance of irrelevant factors. By analyzing the special movement tendency of sign words, we propose two novel Key-frame selection schemes. Since no other datasets can have similar data distribution with our preprocessed data, we invent a novel cross-sampling approach, which successfully prevent the overfitting under small sample. To enhance the diversity of data, we take several sampling-based videos as input, and learn spatiotemporal features based on R(2+1)D-18 layers, which is successful in action recognition tasks. Finally, it is shown that our solution can obtain the state-of-the-art performance.","PeriodicalId":90079,"journal":{"name":"... International Workshop on Pattern Recognition in NeuroImaging. International Workshop on Pattern Recognition in NeuroImaging","volume":"66 1","pages":"115260A - 115260A-7"},"PeriodicalIF":0.0,"publicationDate":"2020-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76998664","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Feature detection based on linear prediction residual for spoofing countermeasures of speaker verification system","authors":"Min Chen, Yibiao Yu","doi":"10.1117/12.2574590","DOIUrl":"https://doi.org/10.1117/12.2574590","url":null,"abstract":"The pre-research shows that Linear prediction (LP) residual contains more discriminative information related to replay spoofing attacks, so this paper proposes three features based on LP residual and IMel filter-banks which closely distributed in the high-frequency regions for replay spoofing countermeasures. They are residual IMel frequency cepstral coefficient (RIMFC), LP residual Hilbert envelope IMel frequency cepstral coefficient (LHIMFC) and residual phase cepstral coefficient (RPC). The effectiveness of these features is demonstrated on ASVspoofing2017 Challenge Version 2.0 dataset. Experimental results indicate that the proposed features outperform the baseline system using constant Q cepstral coefficient (CQCC), and the equal error rate (EER) is reduced under the same conditions. Moreover, feature fusions help to achieve higher performance than traditional IMel frequency cepstral coefficient (IMFCC) and CQCC, which indicates that the complementary information of different features is beneficial for detecting replay attacks.","PeriodicalId":90079,"journal":{"name":"... International Workshop on Pattern Recognition in NeuroImaging. International Workshop on Pattern Recognition in NeuroImaging","volume":"58 1","pages":"115260E - 115260E-5"},"PeriodicalIF":0.0,"publicationDate":"2020-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89929910","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improved hierarchical models for non-native Chinese handwriting recognition using hidden conditional random fields","authors":"Hao Bai, Xiwen Zhang","doi":"10.1117/12.2574420","DOIUrl":"https://doi.org/10.1117/12.2574420","url":null,"abstract":"Hierarchical models with HMM has the advantage of recognizing Chinese characters in digital ink from non-native language writers. However, the recognition performance has been limited by the attribute of generative model of HMM. In this paper, we apply Hidden Conditional Random Field to improve the performance of hierarchical models. First, strokes in one Chinese character are classified with HCRF and then concatenated to the stroke symbol sequence. In the meantime, the structure of components in one ink character is extracted. According to the extraction result and the stroke symbol sequence, candidate characters are traversed and scored. Finally, the recognition candidate results are listed by descending. The approach proposed is validated by testing 19815 copies of the handwriting Chinese characters written by foreign students.","PeriodicalId":90079,"journal":{"name":"... International Workshop on Pattern Recognition in NeuroImaging. International Workshop on Pattern Recognition in NeuroImaging","volume":"23 1","pages":"1152609 - 1152609-5"},"PeriodicalIF":0.0,"publicationDate":"2020-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80594711","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Research on camera calibration method based on coplanar points","authors":"Wenlong Li, Wentao Wang, Wei Cheng, H. Ge, Shuo Jin, Yuan Ren, Xinqiang Ma","doi":"10.1117/12.2574591","DOIUrl":"https://doi.org/10.1117/12.2574591","url":null,"abstract":"In order to achieve the camera calibration, the calculation process of the camera’s internal and external parameters was obtained through the established camera calibration model. Based on the coplanar points, the camera calibration model was simplified. With distortion model and Levenberg-Marquardt algorithm, the system calibration’s accuracy was improved. The experimental results showed that the calibration error was smaller and the error data was more concentrated, which realized the accurate calibration of the camera.","PeriodicalId":90079,"journal":{"name":"... International Workshop on Pattern Recognition in NeuroImaging. International Workshop on Pattern Recognition in NeuroImaging","volume":"19 1","pages":"115260F - 115260F-7"},"PeriodicalIF":0.0,"publicationDate":"2020-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87812305","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}