{"title":"Natural scene text recognition based on artificial intelligence machine learning","authors":"Jun Yin, Jianye Zhang, Degao Li","doi":"10.1117/12.2685586","DOIUrl":"https://doi.org/10.1117/12.2685586","url":null,"abstract":"The text of physical scene images often contains a lot of accurate and advanced semantic information. Due to the rapid development of mobile networks and computer vision technology, this information has been widely used in applications such as geographic location, license plate identification, and unmanned driving. Therefore, this article mainly investigates the natural scene text recognition of artificial intelligence machine learning, and understands the relevant basic theories of natural scene text recognition on the basis of literature data, and then renews the natural scene text recognition using artificial intelligence machine learning, and test the quoted text recognition algorithm. The conclusion of the test is that the recognition accuracy of the algorithm in this paper is 83.7%, so the natural scene recognition model designed in this paper is effective.","PeriodicalId":305812,"journal":{"name":"International Conference on Electronic Information Technology","volume":"264 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115282947","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design and implementation of Elasticsearch-based intelligent search for home shopping platform","authors":"Shengming Zhang, Zhihong Pan, Jinhong Chen, Jinghao Zhou, Weinan Wang, Donglin Wu, Shenyin Wan, Kangdong Chen","doi":"10.1117/12.2685777","DOIUrl":"https://doi.org/10.1117/12.2685777","url":null,"abstract":"The search function of shopping software is essential, and good recommendations can increase the user's desire to buy. The fuzzy search and image similarity search proposed in this paper is a new retrieval method built on Elasticsearch, which can speed up the search and improve the retrieval's correctness. Its support for various complex texts dramatically facilitates the development of this project. This search type is used in home shopping software to improve the user's comfort significantly. The text is developed and designed based on the Golang language, whose high concurrency and excellent library functions help implement the functionality extensively. The user side is presented as a WeChat applet, which lowers the threshold of use and increases the dependency of users. With Elasticsearch's support for multiple languages and its unique vector search and text embedding features, the system can train models such as Contrastive Language-Image Pretraining (CLIP) and Natural Language Processing (NLP) on different images and languages, improving the search's accuracy. For the model generated by the training, vector search is performed to achieve the purpose of the search, and finally, the search results are returned to the front-end applet page for exhibition.","PeriodicalId":305812,"journal":{"name":"International Conference on Electronic Information Technology","volume":"262 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124280397","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tower crane anti-collision algorithm design and simulation","authors":"Yang Zhang, Xinrong Li","doi":"10.1117/12.2685476","DOIUrl":"https://doi.org/10.1117/12.2685476","url":null,"abstract":"As the main transportation tool at the construction site, the tower crane is essential for operation. In order to avoid collisions with the surrounding obstacles or other tower cranes when running, this article designs a tower crane anti-collision algorithm. This algorithm first models the tower crane and obstacles, and realizes the collision monitoring between the tower crane and the obstacles by dynamically calibrating the alarm area around the obstacles; information interaction, combined with a trajectory prediction algorithm, implements the collision monitoring between the tower cranes; at the same time, it is proposed to apply the Kalman filter to the trajectory prediction of the tower crane to improve the accuracy of the risk prediction; finally, a simulation platform is designed with QT, and randomly generated tower crane operating data are used to verify the algorithm's effectiveness. After testing, the algorithm can effectively avoid the occurrence of tower crane collision accidents, and has certain application prospects.","PeriodicalId":305812,"journal":{"name":"International Conference on Electronic Information Technology","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127163012","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A high-power RF transmission line status monitoring method based on sound recognition","authors":"F. Zeng, W. Ma, Liang Lu","doi":"10.1117/12.2685447","DOIUrl":"https://doi.org/10.1117/12.2685447","url":null,"abstract":"To ensure the safe operation of a superconducting accelerator system, real-time monitoring of the RF power source, transmission lines, and superconducting cavities is essential. Currently, the main method for monitoring the status of transmission lines in superconducting accelerator systems is through monitoring the standing wave ratio. However, it is difficult to effectively monitor faults during high reflection or full reflection operations, which can pose significant safety risks. To address this issue, this paper proposes an online monitoring and positioning technique for RF transmission line faults based on acoustic fingerprinting. By studying the spectral characteristics and transmission mechanism of high-power RF transmission line faults, the sound recognition and classification experiment achieved a recognition accuracy of 98.0%, demonstrating the feasibility of this method in identifying faults in RF transmission lines.","PeriodicalId":305812,"journal":{"name":"International Conference on Electronic Information Technology","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125325887","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient iterative decoding algorithm of RS-LDPC concatenated schemes with 5G-LDPC codes","authors":"Kai Liu, Ming Jiang, Lijie Hu","doi":"10.1117/12.2685769","DOIUrl":"https://doi.org/10.1117/12.2685769","url":null,"abstract":"Despite the performance of 5th Generation Low Density Parity Check (5G-LDPC) codes close to the capacity limit, the near-optimal floating-point BP decoding will lead to excessive use of hardware and high complexity of calculation. Thus, the quantized min-sum decoder with limited-precision is usually used in practical implementations. No matter using high or low precision quantization, some 5G-LDPC codes always suffer from the problem of high error floor, which affects their applications in future ultra-reliable scenarios. Using Reed-Solomon (RS) codes as the outer codes can significantly lower the error floor of 5G-LDPC codes due to their Hamming distance properties and excellent error-correction capability. In order to further improve the overall performance, an efficient iterative decoding algorithm of RS-LDPC concatenated codes is proposed in this paper. The proposed iterative algorithm of RS-LDPC codes can achieve noticeable gains over the 5G-LDPC codes at lower FER.","PeriodicalId":305812,"journal":{"name":"International Conference on Electronic Information Technology","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114954237","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Visual transformer-based image retrieval with multiple loss fusion","authors":"Huayong Liu, Cong Huang, Hanjun Jin, Xiaosi Fu, Pei Shi","doi":"10.1117/12.2685738","DOIUrl":"https://doi.org/10.1117/12.2685738","url":null,"abstract":"Through hash learning, the image retrieval based on deep hash algorithm encodes the image into a fixed length hash code for fast retrieval and matching. However, previous deep hash retrieval models based on convolutional neural networks extract local information of the image using pooling and convolution technology, which requires deeper networks to obtain long distance dependency, leading to high complexity and computation. In this paper, we propose a visual Transformer model based on self-attention to learn long dependencies of images and enhance the extraction ability of image features. Furthermore, a loss function with multiple loss fusion is proposed, which combines hash contrastive loss, classification loss, and quantization loss, to fully utilize image label information to improve the quality of hash coding by learning more potential semantic information. Experimental results demonstrate the superior performance of the proposed method over multiple classical deep hash retrieval methods based on CNN and two transformer-based hash retrieval methods, on two different datasets and different lengths of hash code.","PeriodicalId":305812,"journal":{"name":"International Conference on Electronic Information Technology","volume":"98 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122653683","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Joint entity and relation extraction with part-of-speech-aware attention and dependency parsing embedding","authors":"huaiqian he","doi":"10.1117/12.2685463","DOIUrl":"https://doi.org/10.1117/12.2685463","url":null,"abstract":"Joint entity and relation extraction is an important task in natural language processing, whose purpose is to obtain all triples in text. However, the existing models seldom pay attention to the part-of-speech (pos) of each word and the dependency parsing (dp) in the sentence. To solve these problems, a joint extraction model with part-of-speech-aware attention and dependency parsing embedding is proposed, named PADPE. The proposed model obtains better word representation through pos-aware attention mechanism. In addition, the parts of speech and dependency characteristics are integrated respectively in entity classification and relation classification to improve the accuracy of the classifier. The experimental results demonstrate that our model can solve the overlapping triple problem more effectively and outperform other baselines on three public datasets.","PeriodicalId":305812,"journal":{"name":"International Conference on Electronic Information Technology","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129519446","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Research on drug box fault detection based on improved YoLov4","authors":"Zedong Wu, Zhiqiang Zhang, Wenhui Zhu, Bao-hua Wu, Kaixuan Liu, Yining Hao","doi":"10.1117/12.2685856","DOIUrl":"https://doi.org/10.1117/12.2685856","url":null,"abstract":"In order to solve the problem of fault detection and identification of drug boxes on the conveyor belt of automatic drug vending machine, a target detection algorithm based on machine vision and deep neural network of efficient channel and spatial attention mechanism was proposed, named AT-YOLOV4. Firstly, the data set of Western medicine box fault detection was constructed. Secondly, the target detection model YOLOv4 with One-Stage structure was adopted, and the backbone network of the model was improved. In the Backbone network of this model, the efficient channel and spatial attention mechanism is integrated into the backbone module of YOLOv4 model. The improved model was compared with the unimproved YOLOv4 model, YOLOv3 model, YOLOv3-SPP model and YOLOv5s model for the correlation algorithm index experiments. Results The AT-YOLOV4 model with the efficient channel attention mechanism can effectively improve the recognition rate of the drug box and reduce the weight of the model. The AT-YOLOv4 model was significantly superior to other models in accuracy, recall rate and mean accuracy, and the mean accuracy of drug box identification reached 99.6%.","PeriodicalId":305812,"journal":{"name":"International Conference on Electronic Information Technology","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129296899","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design and implementation of seismic safety evaluation technical service system","authors":"Andong Zhang, Zhanling Fu, Wuping Gao, C. Yan","doi":"10.1117/12.2685744","DOIUrl":"https://doi.org/10.1117/12.2685744","url":null,"abstract":"Accompanied by the advancement of seismic engineering technology and the establishment of a seismic management system, seismic safety evaluation (hereinafter referred to as “safety evaluation”) provides a fundamental basis for seismic design and construction of major construction projects and plays an important role in the field of seismic safety. In this paper, the computer software technology is used to design, develop and implement the safety evaluation technical service system for the scenarios of data collection, quality supervision, and application of safety evaluation results. The implementation of the system not only addresses the problems in the existing work, but also realizes the tasks of applying the results on the constructor side, managing the results on the undertaker side, and supervising the industry on the supervisor side, in an attempt to effectively ensure the scientificity, reliability, and accuracy of the safety evaluation results and improve the overall service quality and level of the industry in the region.","PeriodicalId":305812,"journal":{"name":"International Conference on Electronic Information Technology","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123615077","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Face attribute editing network based on style-content disentanglement and convolutional attention","authors":"Jiansheng Cui, Quansheng Dou","doi":"10.1117/12.2685479","DOIUrl":"https://doi.org/10.1117/12.2685479","url":null,"abstract":"Face attribute editing is a research hotspot in the field of computer vision, which aims to modify a certain attribute of a face image to generate a new face image. The current methods based on Generative Adversarial Networks (GAN) have attribute entanglement problems and the implementation process is relatively complicated. To this end, this paper proposes a face attribute editing network based on style-content disentanglement and convolutional attention. Adding convolutional attention (CAT) module to the StyleGAN generator makes the network's control of content features no longer affected by the overall style of the image, and realizes the separation of spatial content and style from coarse to fine. In addition, the hierarchical CAT modules control different levels of attribute features, and changing the input of any layer of CAT can change the corresponding attribute features. The experimental results on the CelebA-HQ dataset show that the method in this paper can achieve disentangled editing of face attributes, and the scores of various indicators are better than the existing models.","PeriodicalId":305812,"journal":{"name":"International Conference on Electronic Information Technology","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114061501","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}