{"title":"TIRec: Transformer-based Invoice Text Recognition","authors":"Yanlan Chen","doi":"10.1145/3590003.3590034","DOIUrl":"https://doi.org/10.1145/3590003.3590034","url":null,"abstract":"A novel invoice text recognition model is proposed. In recent years, researchers have explored text recognition methods with RNN-like structures to model semantic information. However, RNN-based approaches have obvious drawbacks, such as level-by-level decoding and one-way serial transmission of semantic information, which greatly limit the effectiveness of the semantic information and the computational efficiency. Invoice text, in contrast, exhibits strong contextual relationships owing to its fixed text patterns; its fonts are more uniform, and its backgrounds are far less complex than those of natural scenes. To further exploit these contextual relationships and adapt to the characteristics of invoice text, we propose a new text recognition framework inspired by the Transformer [1]. Self-attention-based architectures, in particular the Transformer, have been successful in natural language processing (NLP), where they have demonstrated powerful semantic modeling capabilities. Inspired by this success, we apply the Transformer to invoice text recognition. Unlike RNN-based approaches, we reduce the parameters of the vision network used to extract image features, use a Convolutional Vision Transformer Attention module to capture semantic information, and use a Transformer decoding module to decode all characters in parallel. This Transformer-based architecture is intended to better model the semantic information in invoices while remaining lightweight. We also collected text images from more than 40,000 train invoices, VAT invoices, rolled invoices, and cab invoices. Experiments on the collected invoice text recognition dataset show that our approach outperforms previous methods in both accuracy and speed.","PeriodicalId":340225,"journal":{"name":"Proceedings of the 2023 2nd Asia Conference on Algorithms, Computing and Machine Learning","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125562497","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Foreign object recognition method of transmission line based on improved outlier rate method","authors":"Dongmei Liu, Zhongwang Zhu, Bo Chen","doi":"10.1145/3590003.3590073","DOIUrl":"https://doi.org/10.1145/3590003.3590073","url":null,"abstract":"Foreign matter hanging on a transmission line is a potential risk to the transmission system: it not only disrupts the normal power supply of the line but also poses a serious threat to pedestrians and vehicles below it. To address the low efficiency and high false detection rate of traditional methods for recognizing hanging foreign objects, this paper proposes a foreign object recognition method for transmission lines based on an improved outlier rate method. The method uses the Hough line transform to extract the transmission line, performs a convolution operation on the transmission-line region and the non-transmission-line region, and sets a corresponding outlier rate in accordance with the actual error to identify foreign matter on the transmission line.","PeriodicalId":340225,"journal":{"name":"Proceedings of the 2023 2nd Asia Conference on Algorithms, Computing and Machine Learning","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129878352","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PhyGNNet: Solving spatiotemporal PDEs with Physics-informed Graph Neural Network","authors":"Longxiang Jiang, Liyuan Wang, Xinkun Chu, Yonghao Xiao, Hao Zhang","doi":"10.1145/3590003.3590029","DOIUrl":"https://doi.org/10.1145/3590003.3590029","url":null,"abstract":"Partial differential equations (PDEs) are a common means of describing physical processes, and solving them yields simulated results of physical evolution. The mainstream neural network approach is to minimize the residual loss of the PDEs, thereby constraining the network to fit the solution mapping. Depending on how differentiation is implemented, these methods divide into PINN methods based on automatic differentiation and other methods based on discrete differentiation. PINN methods rely on automatic differentiation via backpropagation, a time-consuming step; under iterative training, the network complexity and the number of collocation points must therefore be kept small, which degrades accuracy. Discrete differentiation is computationally more efficient but assumes a regular computational domain, an assumption that does not necessarily hold in practice. In this paper, we propose PhyGNNet, a method for solving PDEs based on graph neural networks and discrete differentiation on irregular domains. To verify the validity of the method, we solve the Burgers equation and compare numerically with PINN. The results show that the proposed method outperforms PINN in both fitting ability and temporal extrapolation. Code is available at https://github.com/echowve/phygnnet.","PeriodicalId":340225,"journal":{"name":"Proceedings of the 2023 2nd Asia Conference on Algorithms, Computing and Machine Learning","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127997209","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Proceedings of the 2023 2nd Asia Conference on Algorithms, Computing and Machine Learning","authors":"","doi":"10.1145/3590003","DOIUrl":"https://doi.org/10.1145/3590003","url":null,"abstract":"","PeriodicalId":340225,"journal":{"name":"Proceedings of the 2023 2nd Asia Conference on Algorithms, Computing and Machine Learning","volume":"02 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129128390","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}