Ershat Arkin, Nurbiya Yadikar, Yusnur Muhtar, K. Ubul
{"title":"A Survey of Object Detection Based on CNN and Transformer","authors":"Ershat Arkin, Nurbiya Yadikar, Yusnur Muhtar, K. Ubul","doi":"10.1109/PRML52754.2021.9520732","DOIUrl":"https://doi.org/10.1109/PRML52754.2021.9520732","url":null,"abstract":"The task of object detection is to find all the objects of interest in an image and to determine their classes and positions; it is one of the core problems in the field of computer vision. Since the emergence of AlexNet, convolutional neural networks have held a dominant position in computer vision, and research on convolutional neural networks and algorithm structures has become increasingly in-depth. Object detection algorithms can be roughly divided into two categories: candidate-based (two-stage) and regression-based (one-stage). Candidate-region-based algorithms have high accuracy, but their structure is complex and detection is slow. Regression-based algorithms have a simple structure and fast detection speed, which gives them high application value in real-time object detection, but their detection accuracy is relatively low. In pursuit of both speed and accuracy, researchers have tried to apply mainstream methods from other fields. Recently, Transformers from the NLP field have been used in computer vision, for example ViT and Swin Transformer. These works showed that Transformer-based models perform similarly to or better than neural network algorithms and pointed out new paths for researchers. This paper introduces classic neural networks, discusses the advantages and disadvantages of convolutional neural networks used in object detection algorithms, and introduces the latest innovative Transformer methods used in computer vision. Finally, the difficulties, challenges and future development of convolutional neural networks and Transformers in object detection are considered.","PeriodicalId":429603,"journal":{"name":"2021 IEEE 2nd International Conference on Pattern Recognition and Machine Learning (PRML)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114599606","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Shuang Pan, Zihui Xie, Xianao Yang, G. Lin, Yulian Jiang
{"title":"Intelligent Robot for Cleaning Garbage Based on OpenCV","authors":"Shuang Pan, Zihui Xie, Xianao Yang, G. Lin, Yulian Jiang","doi":"10.1109/PRML52754.2021.9520722","DOIUrl":"https://doi.org/10.1109/PRML52754.2021.9520722","url":null,"abstract":"In order to clean the garbage effectively in small areas, such as communities, gardens and squares, and save the cost of garbage cleaning, this paper developed an intelligent robot, which can clean garbage independently outdoors. The vacuum cleaner on the chassis of the robot can inhale small garbage, and the flexible manipulator can grab big garbage. The robot identifies garbage based on OpenCV. To improve the accuracy of garbage recognition, the method of edge detection and contour detection is used in the process of image recognition. The test of cleaning efficiency based on the Yolo model shows that the recognition accuracy of the robot is 95%. It can clean garbage well.","PeriodicalId":429603,"journal":{"name":"2021 IEEE 2nd International Conference on Pattern Recognition and Machine Learning (PRML)","volume":"15 1-2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114111863","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zhijun Gao, Jian Wang, Xingle Wang, Xichao Dong, Yi Li
{"title":"A Review of Segmentation and Classification for Retinal Optical Coherence Tomography Images","authors":"Zhijun Gao, Jian Wang, Xingle Wang, Xichao Dong, Yi Li","doi":"10.1109/PRML52754.2021.9520706","DOIUrl":"https://doi.org/10.1109/PRML52754.2021.9520706","url":null,"abstract":"Optical coherence tomography (OCT) is one of the important auxiliary tools for ophthalmologists to screen and diagnose human retinal diseases. This paper reviews the field from three aspects: methods for segmenting macular edema in retinal OCT images, methods for segmenting the retinal layers, and methods for classifying retinal macular degeneration. First, the current representative research methods are classified, and then they are summarized and discussed. Finally, the current problems in OCT medical image processing are briefly summarized and future prospects are considered.","PeriodicalId":429603,"journal":{"name":"2021 IEEE 2nd International Conference on Pattern Recognition and Machine Learning (PRML)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114715456","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Effects of Pre-processing on the Performance of Transfer Learning Based Person Detection in Thermal Images","authors":"Noor Ul Huda, Rikke Gade, T. Moeslund","doi":"10.1109/PRML52754.2021.9520729","DOIUrl":"https://doi.org/10.1109/PRML52754.2021.9520729","url":null,"abstract":"Thermal images make it possible to identify objects even in low-light conditions. However, person detection in thermal images is tricky because the appearance of a person varies with the surrounding temperature. Three major polarities are commonly observed in these representations: (1) the person is warmer than the background, (2) the person is colder than the background, and (3) the person’s body temperature is similar to the background. In this work, we study and analyze the performance of the detection network using the data in its original form and after harmonizing the person representation in two ways: dark persons on a light background and light persons on a darker background. The data passed to each testing scenario was first pre-processed using histogram stretching to enhance the contrast. The work also presents a method to separate the three kinds of images in thermal data. The analysis is performed on the publicly available outdoor AAUPD-T and OSU-T datasets. Precision, recall, and F1 score are used to evaluate network performance. The results show that network performance is not enhanced by the mentioned pre-processing. Best results are obtained by using the data in its original form.","PeriodicalId":429603,"journal":{"name":"2021 IEEE 2nd International Conference on Pattern Recognition and Machine Learning (PRML)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127698695","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using Denoised LSTM Network for Tourist Arrivals Prediction","authors":"Junke Wang, Peng Ge, Zhusheng Liu","doi":"10.1109/PRML52754.2021.9520695","DOIUrl":"https://doi.org/10.1109/PRML52754.2021.9520695","url":null,"abstract":"Precise tourist arrivals prediction is required since tourism products are perishable and vulnerable to environmental change. Many studies have been pursuing more effective techniques to forecast tourist arrivals after the worldwide COVID-19. A hybrid method based on singular spectrum analysis (SSA) and long short-term memory network (LSTM) that incorporates various varieties of time series, containing historical tourist arrivals and search intensity indices (SII), is proposed to make tourist arrivals predictions. The proposed method is applied to the empirical studies and its results outperform all baseline models which verifies the effectiveness of the denoised deep learning method for high-frequency predictions. In addition, experimental results on independent SII variables reveal that SII data is of great significance to tourist arrivals predictions and provides practitioners with deeper comprehension of potential tourism forecasting factors.","PeriodicalId":429603,"journal":{"name":"2021 IEEE 2nd International Conference on Pattern Recognition and Machine Learning (PRML)","volume":"101 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132543804","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Efficient Co-location Pattern Approximation Algorithm Based on Clustering Branches","authors":"Duan Duanping","doi":"10.1109/PRML52754.2021.9520713","DOIUrl":"https://doi.org/10.1109/PRML52754.2021.9520713","url":null,"abstract":"The spatial co-location pattern represents a set of spatial features, whose instances are frequently associated in the space. However, due to the exponential time complexity of the traditional algorithm, the operation efficiency of the algorithm is not high, especially in the face of massive data mining, it is unable to complete the mining task normally. Therefore, an efficient co-location pattern approximation algorithm is proposed. The new algorithm first clusters according to the feature instances, takes each center as the new instance coordinates, and associates the number of instances of each family. On this basis, the mining area is divided into branches, and the distance threshold is taken for the row spacing, so as to achieve the purpose of fast pruning. On the premise of ensuring high accuracy, the algorithm effectively solves the efficiency problem of traditional algorithms, and effectively solves the spatial colocation pattern mining of massive data. A large number of experiments show that the new algorithm has the advantages of high efficiency, stability and high accuracy.","PeriodicalId":429603,"journal":{"name":"2021 IEEE 2nd International Conference on Pattern Recognition and Machine Learning (PRML)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133716926","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Super Resolution of Single Image Based on Multi Level Residual Self Attention Mechanism","authors":"Junfeng Mao, Yaqi Hu","doi":"10.1109/PRML52754.2021.9520742","DOIUrl":"https://doi.org/10.1109/PRML52754.2021.9520742","url":null,"abstract":"The existing network models achieve good reconstruction effect by deepening the network depth, but most of them have problems such as insufficient feature information extraction, single scale of feature information, weak perception of valuable information and so on. In order to solve this problem, this paper proposes a single image super-resolution network based on multi-level residual self attention mechanism. Firstly, shallow features and deep features are extracted from the input low resolution image hierarchically, and then convolution operation is performed on the deep features and shallow features to obtain high resolution image. Compared with the existing comparison methods, the reconstruction effect of this method is better, and the objective evaluation indexes PSNR and SSIM are also improved.","PeriodicalId":429603,"journal":{"name":"2021 IEEE 2nd International Conference on Pattern Recognition and Machine Learning (PRML)","volume":"103 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133724979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Weixu Liu, Zhifeng Tang, Pengfei Zhang, Xiangxian Chen, Bin Yang
{"title":"Damage Detection in Switch Rails via Machine Learning","authors":"Weixu Liu, Zhifeng Tang, Pengfei Zhang, Xiangxian Chen, Bin Yang","doi":"10.1109/PRML52754.2021.9520705","DOIUrl":"https://doi.org/10.1109/PRML52754.2021.9520705","url":null,"abstract":"The switch rail is a weak but essential component of high-speed rail (HSR) systems. Due to aging and the potential for fatigue damage accumulation, damage detection is urgently required. An automatic classification method for switch rail damage based on feature integration and machine learning is proposed. According to the characteristics of the switch rail and guided waves, several features extracted from different signal processing domains (such as the time domain, power spectrum domain and time-frequency domain) are proposed and defined to characterize the complexity of switch rail damage. A damage index is defined to eliminate the effects of various environmental and operational conditions. A feature selection method based on binary particle swarm optimization (BPSO) is proposed. This method uses a new fitness function to select the most damage-sensitive features, eliminate irrelevant and redundant features, and improve classification performance. The least-squares support-vector machine (LS-SVM) is adopted to build an automatic classification model that reduces the probability of human misdiagnosis and improves generalization ability. Finally, an experiment on the switch rail foot is conducted to verify the proposed method. The results show that the method can identify damage and performs better than traditional methods.","PeriodicalId":429603,"journal":{"name":"2021 IEEE 2nd International Conference on Pattern Recognition and Machine Learning (PRML)","volume":"96 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115503597","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Guanyu Lin, Lei Huang, Yuting Yin, Chengmin Zhang, Feng Zhu, Lingqi Kong, Zhiheng Li
{"title":"Efficient and Bias-aware Recommendation with Two-side Relevance for Implicit Feedback","authors":"Guanyu Lin, Lei Huang, Yuting Yin, Chengmin Zhang, Feng Zhu, Lingqi Kong, Zhiheng Li","doi":"10.1109/PRML52754.2021.9520701","DOIUrl":"https://doi.org/10.1109/PRML52754.2021.9520701","url":null,"abstract":"Today’s widespread recommender systems are usually built on implicit data such as clicks, which are easy to collect, but it is unclear whether unclicked data is negative feedback or unobserved positive feedback, which complicates model construction. In response, Relevance Matrix Factorization (Rel-MF) was recently proposed to tackle this problem as well as the missing-not-at-random (MNAR) problem ignored by previous studies. However, Rel-MF has three problems: limited assumption (LA), negative square loss (NSL) and indiscriminate no click data (INCD). In this paper, we first remove Rel-MF’s limited assumption and establish a more general theory by incorporating a defined transformation function, which captures the relevance level, into our two-side relevance ideal loss; this theory subsumes Rel-MF’s theory. To resolve the INCD problem and the NSL problem, we introduce an adjusting variable and perform normalization, respectively; we call the result Naive Solution with Normalization for Rel-MF (NRel-MF). We then show analytically that the clipped function proposed by Rel-MF suffers from high variance. To overcome this, we design a power clipped function and further propose Improved Solution with Power Function for Rel-MF (PRel-MF). In addition, we explore propensity score estimation from user and hybrid perspectives, in contrast to Rel-MF’s sole item perspective. Finally, we consider and address the computational problem caused by Rel-MF’s non-sampling strategy. Empirical results verify the effectiveness of our solutions in terms of both performance, even on rare items, and loss decrease. In experiments across the perspectives, the item perspective performs well with fewer recommended items and the user perspective with more recommended items, while the hybrid perspective outperforms both in more situations.","PeriodicalId":429603,"journal":{"name":"2021 IEEE 2nd International Conference on Pattern Recognition and Machine Learning (PRML)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127236619","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"How Does Chinese Segmentation Strategy Effect on Sentiment Analysis of Short Text?","authors":"Qing Lei, Haifeng Li, Yanxi Chen","doi":"10.1109/PRML52754.2021.9520738","DOIUrl":"https://doi.org/10.1109/PRML52754.2021.9520738","url":null,"abstract":"In Chinese natural language processing, one particular problem is how to choose the word segmentation strategy, which is commonly either char-based or word-based. For sentiment analysis of short text, as compared with long text, word-based segmentation faces a further problem: short text contains more ambiguous or unregistered words. The features extracted under different Chinese word segmentation strategies affect the statistical distribution of features and, in turn, the accuracy of sentiment analysis. This paper evaluates the effect of five Chinese segmentation strategies on sentiment analysis of short text. We chose two word-based Chinese Word Segmentation (CWS) methods and three char-based n-gram strategies, then transformed the Bag-of-Words (BOW) representation into a Vector Space Model (VSM), which was finally fed into several classifiers to predict the sentiment polarity of short text. To reduce the impact of corpora, the study is based on a collection of five public corpora.","PeriodicalId":429603,"journal":{"name":"2021 IEEE 2nd International Conference on Pattern Recognition and Machine Learning (PRML)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130686674","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}