Nurbiya Xamxidin, Mahpirat Mamat, Wenxiong Kang, A. Aysa, K. Ubul
{"title":"Off Line Handwritten Signature Verification Based on Feature Fusion","authors":"Nurbiya Xamxidin, Mahpirat Mamat, Wenxiong Kang, A. Aysa, K. Ubul","doi":"10.1109/PRML52754.2021.9520737","DOIUrl":"https://doi.org/10.1109/PRML52754.2021.9520737","url":null,"abstract":"At present, most research on offline handwritten signatures is based on a single language, and the problems of sparse signature images, weak feature representation ability and low verification rate have not been well solved. In this paper, off-line handwritten signature images in two different languages, Chinese and Kazakh, are used as experimental data. The experimental results show that, even with a small amount of training data, the accuracy of the proposed method in multilingual off-line handwritten signature verification still reaches 96.74%, which is higher than that of related work.","PeriodicalId":429603,"journal":{"name":"2021 IEEE 2nd International Conference on Pattern Recognition and Machine Learning (PRML)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127610339","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Miao Jin, Jun Zhang, Tianfu Huang, Zhiwei Guo, Xiwen Chen
{"title":"Research on Human Action Recognition Based on Global-Local Features of Video","authors":"Miao Jin, Jun Zhang, Tianfu Huang, Zhiwei Guo, Xiwen Chen","doi":"10.1109/PRML52754.2021.9520743","DOIUrl":"https://doi.org/10.1109/PRML52754.2021.9520743","url":null,"abstract":"In human behavior recognition research, the two-stream network structure shows excellent results. Focusing on the branch features of two-stream networks, this paper proposes a two-stream human behavior recognition method based on global-local features. This method first uses mixture-of-Gaussians background modeling to extract silhouette features as global contour features, and then uses TV-Net, an end-to-end learnable unsupervised network, to generate optical flow motion features, which are used as the network input. The Xception network is used as the feature generation network, which improves accuracy without changing the model scale, and fusion classification is performed on the outputs of the two stream branches to obtain the behavior recognition result. This method refines the motion information contained in the global and local features for classification, reduces the computational complexity, and shows a good level of recognition on both public and internal data sets.","PeriodicalId":429603,"journal":{"name":"2021 IEEE 2nd International Conference on Pattern Recognition and Machine Learning (PRML)","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116430954","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xiaoyu Zhang, Xu Ni, Y. Deng, Changyu Jiang, Mina Maleki
{"title":"Chinese License Plate Recognition Using Machine and Deep Learning Models","authors":"Xiaoyu Zhang, Xu Ni, Y. Deng, Changyu Jiang, Mina Maleki","doi":"10.1109/PRML52754.2021.9520386","DOIUrl":"https://doi.org/10.1109/PRML52754.2021.9520386","url":null,"abstract":"The license plate detection and recognition (LPDR) system is one of the practical applications of optical character recognition (OCR) technology in the field of automobile transportation. This paper investigates several state-of-the-art machine and deep learning algorithms for Chinese license plate recognition based on convolutional neural network (CNN), long short-term memory (LSTM), and k-nearest neighbors (KNN) models. A comparison of these models on the Chinese City Parking Dataset (CCPD) demonstrates that the convolutional recurrent neural network (CRNN) model, with an accuracy of 95%, is the most accurate and outperforms the other models in detecting license plates.","PeriodicalId":429603,"journal":{"name":"2021 IEEE 2nd International Conference on Pattern Recognition and Machine Learning (PRML)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121678136","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Leonardo Capozzi, P. Carvalho, Afonso Sousa, C. Pinto, João Ribeiro Pinto, Jaime S. Cardoso
{"title":"Impact of Visual Noise in Activity Recognition Using Deep Neural Networks - An Experimental Approach","authors":"Leonardo Capozzi, P. Carvalho, Afonso Sousa, C. Pinto, João Ribeiro Pinto, Jaime S. Cardoso","doi":"10.1109/PRML52754.2021.9520734","DOIUrl":"https://doi.org/10.1109/PRML52754.2021.9520734","url":null,"abstract":"The popularity of deep learning methods has increased significantly, in no small part due to their impressive performance in several application scenarios. This paper focuses on recognising activities in an in-vehicle environment and measuring the impact that factors such as resolution, aspect ratio, field of view and framerate have on the performance of the model. The use of deep learning methodologies in recent years has increased the amount of data required to train and test the models. However, such data is often insufficient, unavailable, or lacks suitable properties. Publicly available action recognition datasets have been analysed, collected, and prepared to assess the classification results in such scenarios, which provides important guidance for use in a real-world setting.","PeriodicalId":429603,"journal":{"name":"2021 IEEE 2nd International Conference on Pattern Recognition and Machine Learning (PRML)","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126846504","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Depth of Anesthesia Monitoring Method Based on EEG Microstate Analysis and Hidden Markov Model","authors":"Lichengxi Si, Zhian Liu, G. Wang","doi":"10.1109/PRML52754.2021.9520709","DOIUrl":"https://doi.org/10.1109/PRML52754.2021.9520709","url":null,"abstract":"Electroencephalogram (EEG) microstate analysis is an important emerging method that can classify continuous multichannel EEG signals into a limited number of microstates through clustering. Microstate analysis combines the time and space information of EEG, which can reflect important transformation processes of high-level cognitive functions in the brain. In recent years, microstate analysis has made great progress in research on depth of anesthesia (DOA) monitoring. In this paper, a new DOA monitoring algorithm is designed by combining microstate sequences with a hidden Markov model (HMM). The trained HMM reveals the information on brain nerve activity hidden in the microstate sequence, which can effectively distinguish the mental states of different DOAs, thereby realizing the corresponding DOA classification. The experimental dataset was obtained from an open-access section of the University of Cambridge Data Repository, which contains EEG data from 20 healthy subjects. During propofol injection, the brain states of the subjects were divided into four conditions: baseline (BS), mild sedation (ML), moderate sedation (MD), and the recovery stage (RC). The algorithm classified BS and ML, BS and MD, and ML and MD with accuracy rates of 71.40%, 73.48%, and 67.75%, respectively. This shows that microstate analysis has great application potential in the study of anesthesia. Hidden Markov model training for microstate sequences can become a new research direction for DOA monitoring.","PeriodicalId":429603,"journal":{"name":"2021 IEEE 2nd International Conference on Pattern Recognition and Machine Learning (PRML)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126912894","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Artificial Intelligence Use in Human Resources Management: Strategy and Operation’s Impact","authors":"S. Achchab, Yassine Khallouk Temsamani","doi":"10.1109/PRML52754.2021.9520719","DOIUrl":"https://doi.org/10.1109/PRML52754.2021.9520719","url":null,"abstract":"HR technology leaders foresee AI’s growing role in a variety of areas, such as aiding recruitment, improving compliance, augmenting training, streamlining onboarding and more. New artificial intelligence technologies that automate and augment the workforce could be the key to solving some of the thorny issues and increased demands for HR to accomplish more with fewer resources. The article presents the application of artificial intelligence in Human Resources, the challenges and how to use AI to support and develop a successful workforce.","PeriodicalId":429603,"journal":{"name":"2021 IEEE 2nd International Conference on Pattern Recognition and Machine Learning (PRML)","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128385543","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Language Identification Research Based on Dual Attention Mechanism","authors":"Mijit Ablimit, Ma Xueli, A. Hamdulla","doi":"10.1109/PRML52754.2021.9520699","DOIUrl":"https://doi.org/10.1109/PRML52754.2021.9520699","url":null,"abstract":"Language identification (LID) is an important branch of speech technology. A key problem of language identification is how to extract an effective speech segment representation from a given utterance and improve the model performance. In recent years, deep learning has made significant progress in the application of language identification. Neural networks can be used to extract relevant features and effectively improve system performance. To address poor feature extraction ability and low recognition rates, this paper considers both features and models: it compares features such as MFCC and Fbank, determines that the spectrogram is the best input feature, and proposes a language identification method based on a dual attention mechanism. This method first computes the spectrogram of the speech and converts it into a gray-scale spectrogram as input, uses a multi-level convolutional neural network to capture local features, extracts dual attention in the channel and spatial dimensions of the feature map through the CBAM module, captures temporal characteristics with bidirectional gated recurrent units, then passes the local and temporal characteristics jointly to a fully connected layer, which outputs the language classes. Experiments on the Common Voice and AP17-OLR datasets demonstrate that the language identification method based on the dual attention mechanism achieves good results, increases the feature extraction ability and improves the performance of language identification.","PeriodicalId":429603,"journal":{"name":"2021 IEEE 2nd International Conference on Pattern Recognition and Machine Learning (PRML)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134016827","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Edwin Kwadwo Tenagyei, Zongbo Hao, Kwadwo Kusi, K. Sarpong
{"title":"Robust Real-Time Human Action Detection through the Fusion of 3D and 2D CNN","authors":"Edwin Kwadwo Tenagyei, Zongbo Hao, Kwadwo Kusi, K. Sarpong","doi":"10.1109/PRML52754.2021.9520696","DOIUrl":"https://doi.org/10.1109/PRML52754.2021.9520696","url":null,"abstract":"Recent approaches for human action detection often rely on appearance and optical flow networks for frame-level detections before linking them to form action tubes. However, they achieve unsatisfactory performance in real-time due to their huge computational complexity and large parameter usage during training. In this paper, we design and implement a unified end-to-end convolutional neural network (CNN) architecture that consists of two branches, extracting both spatial and temporal information concurrently before predicting bounding boxes and action probabilities from video clips. We also design a novel mechanism that exploits the inter-channel dependencies for an effective fusion of features from the branches. Specifically, we propose a Channel Fusion and Relation-Global Attention (CFRGA) module to aggregate the two features smoothly and model their inter-channel dependencies by considering their global scope structural relation information when inferring attention. We conduct experiments on the untrimmed video dataset, UCF101-24, and achieve impressive results in frame-mAP and video-mAP. The experimental results show that our channel fusion and relation-global attention module contributes to its good performance.","PeriodicalId":429603,"journal":{"name":"2021 IEEE 2nd International Conference on Pattern Recognition and Machine Learning (PRML)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134499337","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Novel Methodology for Automating Spatio-Temporal Data Classification in Basketball Using Active Learning","authors":"Shaojun Ai, Jiaming Na, V. D. Silva, M. Caine","doi":"10.1109/PRML52754.2021.9520715","DOIUrl":"https://doi.org/10.1109/PRML52754.2021.9520715","url":null,"abstract":"The use of machine learning on spatio-temporal datasets has generated significant interest in a range of applications, including vehicular traffic modelling and urban planning. One of the most prolific application domains is sports analytics due to the availability of real-world multi-agent datasets, where such techniques are used to recognize and predict offensive and defensive strategies in a range of team sports. However, the use of advanced machine learning techniques requires the large datasets to be annotated by domain experts, which is a time-consuming task. Active learning is a methodology that significantly cuts down the data-annotation time on large datasets. In this paper, we investigate active learning strategies to annotate spatio-temporal datasets for the purpose of classification model building. The proposed algorithms are demonstrated on a dataset obtained from professional basketball games to classify an offensive strategy known as ‘Pick-and-Roll’. Several neural network architectures are investigated for the classification of more than 900 segments of basketball plays. The results obtained suggest that the proposed, preferred, methodology is well suited for annotating large spatio-temporal datasets and has the potential to be applicable across a range of team sports and non-sports usage scenarios.","PeriodicalId":429603,"journal":{"name":"2021 IEEE 2nd International Conference on Pattern Recognition and Machine Learning (PRML)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130806372","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SFTRLS-Based Speech Enhancement Method Using CNN to Determine the Noise Type and the Optimal Forgetting Factor","authors":"De-You Tang, Guoqiang Chen","doi":"10.1109/PRML52754.2021.9520741","DOIUrl":"https://doi.org/10.1109/PRML52754.2021.9520741","url":null,"abstract":"This paper presents a speech enhancement method combining the convolutional neural network (CNN) and SFTRLS, SFTRLS-CNN, which consists of two tiers of CNN to customize parameters for the SFTRLS algorithm. The first CNN identifies noise type, and the second CNN matches the best forgetting factor. The experimental results show that the noise recognition rate of SFTRLS-CNN goes up to 99.97% and displays better performance than the k-nearest neighbor (KNN) and the support vector machine (SVM). The accuracy ratio of matching the best forgetting factor for the SFTRLS is up to 99.40%. The improvement of the perceptual evaluation of speech quality (PESQ) is 23%, and the decrease of log-spectral distortion (LSD) is 4% on average. SFTRLS-CNN also improves the SNR of all speeches significantly.","PeriodicalId":429603,"journal":{"name":"2021 IEEE 2nd International Conference on Pattern Recognition and Machine Learning (PRML)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116859867","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}