Sequence Recognition of Scene Text Based on CRNN and CTPN Models

Proceedings of the 2022 6th International Conference on Electronic Information Technology and Computer Engineering. Pub Date: 2022-10-21. DOI: 10.1145/3573428.3573462

Yiyi Liu



Abstract

Image-based sequence recognition has recently become a prominent research topic in computer vision, and text detection and recognition in natural scenes is an active research field. Using scene text data, this paper discusses the theory behind the deep learning-based CRNN and CTPN models and how they process text. CRNN turns text recognition into a time-dependent sequence learning problem, an approach commonly used for text sequences of indeterminate length. Contextual relationships in text images are learned using BLSTM and CTC, which effectively improves recognition accuracy and makes the model more robust. CRNN also performs well in both lexicon-free and lexicon-based scene text recognition tests, as it is not constrained by any predefined lexicon. It yields a smaller yet effective model that is better suited to real-world settings. However, CRNN recognition accuracy is lower for short texts with large morphological changes, such as artistic lettering, and for texts that vary greatly in natural scenes. Because of its anchor setting, CTPN can only detect horizontally distributed text, although a small modification, adding horizontal anchors, allows it to detect vertical text. Owing to the limitations of the framework, irregularly inclined text can be detected only coarsely.
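To illustrate how CTC lets a recognizer handle text of indeterminate length, the following is a minimal sketch of greedy CTC decoding. It is not taken from the paper: the function name `ctc_greedy_decode`, the toy alphabet, and the choice of index 0 for the blank symbol are all assumptions made for illustration. The input stands in for the per-frame label predictions that a BLSTM would produce.

```python
from itertools import groupby

BLANK = 0  # assumption: index 0 is the CTC blank symbol


def ctc_greedy_decode(frame_labels, alphabet, blank=BLANK):
    """Map per-frame labels to text: merge repeated labels, then drop blanks."""
    # Step 1: collapse runs of identical consecutive labels into one label.
    collapsed = [label for label, _ in groupby(frame_labels)]
    # Step 2: remove blanks; a blank between two equal labels keeps both.
    return "".join(alphabet[label] for label in collapsed if label != blank)


# Hypothetical alphabet: position 0 is the blank, the rest are characters.
alphabet = ["-", "h", "e", "l", "o"]

# Eleven frames decode to a five-character word; the blank between the two
# runs of "l" is what preserves the double letter.
frames = [1, 1, 0, 2, 3, 3, 0, 3, 4, 4, 0]
print(ctc_greedy_decode(frames, alphabet))
```

The number of input frames and the length of the output string are independent, which is why this decoding rule suits text sequences whose length is not known in advance.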

