Allen Joshey, Ashish Tiwari, Rakesh Sankar, Sahil Salim Makandar
{"title":"A Deep Learning model capable of producing heatmap probabilities for Characters in Natural Scenes.","authors":"Allen Joshey, Ashish Tiwari, Rakesh Sankar, Sahil Salim Makandar","doi":"10.1145/3480651.3480662","DOIUrl":"https://doi.org/10.1145/3480651.3480662","url":null,"abstract":"Text appearing in natural settings comes in all shapes, sizes and textures. Classical methods have often failed to accurately extract the text present in naturally occurring scenes. Text in the wild is organized hierarchically as sentences, words and characters. Methods for detecting text in everyday real-world scenes have found success. However, most available real-world datasets are annotated at the word or line level, which limits detection to words rather than characters. Inspired by Naver Labs' work on CRAFT [2] and by Microsoft Research and Baidu Research's work on WordSup [5], which train models in a weakly supervised manner to obtain character-level predictions, we propose a computationally efficient architecture capable of providing similar results. Our model, once trained on synthetic text to produce character-level annotations, can be fine-tuned for text appearing in natural settings. The methods discussed prove robust enough to identify curved or somewhat deformed text in natural settings. Our approach generates probabilities for the locations of characters and of the gaps between the characters that constitute a word, making it easier to localize characters and words. Our method shows results comparable to CRAFT [2] with only 30% of the number of learnable parameters.","PeriodicalId":305943,"journal":{"name":"Proceedings of the 2021 International Conference on Pattern Recognition and Intelligent Systems","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127743281","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mohamed Lamine Rouali, Said Yacine Boulahia, Abdenour Amamra
{"title":"Simultaneous temporal and spatial deep attention for imaged skeleton-based action recognition","authors":"Mohamed Lamine Rouali, Said Yacine Boulahia, Abdenour Amamra","doi":"10.1145/3480651.3480668","DOIUrl":"https://doi.org/10.1145/3480651.3480668","url":null,"abstract":"The use of skeletons as a modality to represent and recognize human actions has gained interest thanks to the compactness of the data, their reliable representativeness and their strong robustness. Deep learning recognition approaches based on skeletons often improve the recognition pipeline by integrating attention into their modeling. The idea is to let the model focus on the relevant information of the action instead of attempting some kind of blind modeling. In this article, we propose an action recognition approach that integrates spatial and temporal attention simultaneously. We first transform the input sequence data into a color matrix, called an imaged skeleton, comprising Cartesian and rotational information. This new representation is then given as input to an architecture composed of a main trunk, which performs feature extraction and classification, and several attention branches. Experimental evaluations on two popular benchmark databases, namely UT-Kinect [1] and SBU Kinect Interaction [2], are conducted to verify the value of the proposed approach, where improved performance is reported. Index: convolutional neural network, spatio-temporal, skeleton-based action recognition, deep attention.","PeriodicalId":305943,"journal":{"name":"Proceedings of the 2021 International Conference on Pattern Recognition and Intelligent Systems","volume":"81 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121199136","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cell Detection by Robust Self-Trained Networks","authors":"Yuang Zhu, Yuxin Zheng, Zhao Chen","doi":"10.1145/3480651.3480665","DOIUrl":"https://doi.org/10.1145/3480651.3480665","url":null,"abstract":"Cell nucleus detection in digital histopathology images plays an important role in computer-assisted cancer diagnostics. However, the lack of manual annotations and the variability of cells pose great challenges to fully supervised learning. Therefore, we propose a Robust Self-Trained Network (RSTN) for cell detection. The backbone is an encoder-decoder trained on distance maps (DMs) generated from dot annotations of nuclei. To save manual effort, RSTN is designed to include reliable predicted DMs in optimization and to detect cell centers in unseen images automatically. RSTN gains robustness by regularizing the network with dynamic graphs of DM patches. It exploits underlying graph structures and recognizes complex spatial patterns to locate cells of various shapes and colors. Experimental results show that it outperforms several classic and advanced models on both simulated fluorescence microscope images and real pathology slides for cell detection.","PeriodicalId":305943,"journal":{"name":"Proceedings of the 2021 International Conference on Pattern Recognition and Intelligent Systems","volume":"34 6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115737034","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Face anti-spoofing by using Feature Fusion","authors":"Qiong Liu, Lan Zhang","doi":"10.1145/3480651.3480658","DOIUrl":"https://doi.org/10.1145/3480651.3480658","url":null,"abstract":"Security issues have attracted more and more attention with the application of face recognition technology. Face anti-spoofing has become an important derivative subject in current face recognition research, and more and more researchers are devoting themselves to it as the number of fake face types increases. Although face anti-spoofing has been studied from different angles and some methods are notably effective, deficiencies remain and results are not yet satisfactory. How to effectively distinguish real faces from fake faces and improve classification accuracy is the focus of current research.","PeriodicalId":305943,"journal":{"name":"Proceedings of the 2021 International Conference on Pattern Recognition and Intelligent Systems","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128264948","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Food safety pre-warning system based on Robust Principal Component Analysis and Improved Apriori Algorithm","authors":"Xiaowen Ding, Sheng Xu","doi":"10.1145/3480651.3480653","DOIUrl":"https://doi.org/10.1145/3480651.3480653","url":null,"abstract":"In response to the frequent food safety incidents of recent years, a risk pre-warning system for the food supply chain is proposed to ensure food quality. This paper builds a food safety information pre-warning system that uses association rule mining to address safety problems in food production and processing, monitoring detection data in a timely manner and issuing warnings automatically across the whole supply chain. We combine Robust Principal Component Analysis (RPCA), to obtain better clustering performance, with an improved Apriori algorithm, to reduce memory consumption and I/O operations and to shorten the running time. We study the case of a meat producer, and the results show that the proposed pre-warning method can identify safety risks efficiently and report the exact warning when an abnormality is detected by expert analysis. Experiments verify the correctness of the model and the effectiveness of the algorithm.","PeriodicalId":305943,"journal":{"name":"Proceedings of the 2021 International Conference on Pattern Recognition and Intelligent Systems","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114293138","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
S. M. S. Uddin, Md. Samin Morshed, Mahruf Islam Prottoy, A. Rahman
{"title":"Age Estimation from Facial Images using Transfer Learning and K-fold Cross-Validation","authors":"S. M. S. Uddin, Md. Samin Morshed, Mahruf Islam Prottoy, A. Rahman","doi":"10.1145/3480651.3480659","DOIUrl":"https://doi.org/10.1145/3480651.3480659","url":null,"abstract":"Automatic age estimation has gained more and more interest in recent years due to its potential in many applications. Most techniques use hand-crafted features to predict aging patterns, but they are not accurate enough to be employed effectively. Recent advances in deeply learned features extracted by Convolutional Neural Networks (CNNs) allow more accurate facial analysis to be designed. The aim of this paper is to explore the performance of different age estimation techniques that use deep learning methods and to propose a variation of transfer learning that applies K-fold cross-validation on top of transfer learning. The experiment was carried out on the UTKFace dataset using the VGG16, ResNet50 and SENet50 models. The results demonstrate that our method is superior to existing methods in terms of performance.","PeriodicalId":305943,"journal":{"name":"Proceedings of the 2021 International Conference on Pattern Recognition and Intelligent Systems","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117136967","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Application of Wavelet Analysis in Image Matching","authors":"Linglong Tan, Fengzhi Wu, Xiaoyao Yin, Song Xue","doi":"10.1145/3480651.3480670","DOIUrl":"https://doi.org/10.1145/3480651.3480670","url":null,"abstract":"Based on a study of traditional matching methods, this paper implements a low-frequency image matching system based on the wavelet transform, composed of wavelet preprocessing, low-frequency image extraction, and image matching. The low-frequency image obtained from wavelet decomposition is used for matching, which reduces the matching computation time. The low-frequency image still contains most of the visual information of the original image, making the matching result stable and reliable. In this system, image wavelet decomposition and matching use mature and fast algorithms. Matching is performed on low-frequency images, which keeps the amount of computation very small. Matching on the low-frequency components of the image also greatly reduces the interference of noise. Since high-frequency noise, which makes up the largest share of the noise, is removed before matching, all the matching algorithms have good noise resistance. The matching system in this paper adopts a matching method based on the low-frequency components after wavelet transform, and discusses and implements image matching using the low-frequency images obtained from wavelet decomposition. The experimental results show that the matching algorithm used in this paper is fast, requires little matching time, and has some practical value.","PeriodicalId":305943,"journal":{"name":"Proceedings of the 2021 International Conference on Pattern Recognition and Intelligent Systems","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133317960","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design and Implementation of the Wood Knot Recognition System Based on Matlab GUI","authors":"Xiaoxia Yang, Xin Gao, Tao Wang, Kepeng Yang, Chengxin Hu, Xiaoping Liu, Yucheng Zhou","doi":"10.1145/3480651.3480664","DOIUrl":"https://doi.org/10.1145/3480651.3480664","url":null,"abstract":"Wood knots are scars formed by the death of branches in the trunk, and they have a great influence on the quality of logs. Accurately detecting the shape and area of wood knots is of great significance to the production and processing of wood. Wood images with live knots, dead knots, rotten knots and multiple knots are processed successively with gray-level transformation and Gaussian filtering. The Otsu algorithm is used for image thresholding, followed by erosion and dilation, removal of connected domains, edge marking, knot counting, roundness detection, and calculation of the ratio of the knot area to the whole image area. The results show that the contours of knots can be detected accurately after processing, and that the number, roundness and area of knots can be determined. The algorithms mentioned above are embedded into a Graphical User Interface (GUI) using Matlab callback functions.","PeriodicalId":305943,"journal":{"name":"Proceedings of the 2021 International Conference on Pattern Recognition and Intelligent Systems","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127224392","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Synthetic Aperture Radar image target recognition based on hybrid attention mechanism","authors":"Baodai Shi, Qin Zhang, Yao Li","doi":"10.1145/3480651.3480660","DOIUrl":"https://doi.org/10.1145/3480651.3480660","url":null,"abstract":"Deep learning algorithms are increasingly applied in the image field, but their application to SAR image target recognition still faces problems such as poor real-time performance and low precision. To address this, this paper puts forward a convolutional neural network algorithm based on a hybrid attention mechanism. The basic module of the model is composed of a trunk branch and a soft branch. The trunk branch, composed of a residual shrinkage network and an improved channel attention mechanism, is responsible for extracting the main features. The soft branch, composed of upsampling and downsampling, is responsible for extracting the mixed attention weights, which enhances the mapping capacity from input to output. The recognition rate of this model on the MSTAR dataset is 99.6%. According to noise analysis, the model is strongly robust to images with added impulse noise.","PeriodicalId":305943,"journal":{"name":"Proceedings of the 2021 International Conference on Pattern Recognition and Intelligent Systems","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115168047","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}