Scene text spotting based on end-to-end

IF 1.5 Q2 COMPUTER SCIENCE, THEORY & METHODS
Guangcun Wei, Wansheng Rong, Yongquan Liang, Xinguang Xiao, Xiang Liu
DOI: 10.3233/JIFS-200903 (https://doi.org/10.3233/JIFS-200903)
Publication date: 2021-01-01
Publication type: Journal Article
Citations: 0

Abstract

Traditional OCR pipelines ignore the inherent connection between the text detection task and the text recognition task. To address this, the paper proposes a novel end-to-end text spotting framework. The framework has three parts: a shared convolutional feature network, a text detector, and a text recognizer. Because the convolutional feature network is shared, the text detection network and the text recognition network can be optimized jointly. This reduces the computational burden and also exploits the inherent connection between text detection and text recognition. The model adds a TCM (Text Context Module) to Mask RCNN, which effectively addresses the negative sample problem in text detection tasks. The paper also proposes a text recognition model based on SAM-BiLSTM (spatial attention mechanism with BiLSTM), which extracts the semantic information between characters more effectively. The model significantly surpasses state-of-the-art methods on a number of text detection and text spotting benchmarks, including ICDAR 2015 and Total-Text.
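
The idea of a shared feature network with jointly optimized detection and recognition can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the `TextSpotter` class, the toy backbone and loss functions, and the `rec_weight` weighting are all hypothetical, since the abstract does not give the loss formulation. It shows that the shared features are computed once per image and that a single combined loss covers both tasks.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Features = List[float]


@dataclass
class TextSpotter:
    """Hypothetical container: one shared backbone feeding two task heads."""

    backbone: Callable[[Sequence[float]], Features]
    detection_loss: Callable[[Features], float]
    recognition_loss: Callable[[Features], float]
    rec_weight: float = 1.0  # assumed weighting; not stated in the abstract

    def joint_loss(self, image: Sequence[float]) -> float:
        # Shared features are computed once and reused by both heads.
        feats = self.backbone(image)
        return self.detection_loss(feats) + self.rec_weight * self.recognition_loss(feats)


call_log: List[int] = []  # records each backbone invocation


def toy_backbone(image: Sequence[float]) -> Features:
    """Stand-in for a convolutional feature network (illustrative only)."""
    call_log.append(1)
    return [2.0 * x for x in image]


spotter = TextSpotter(
    backbone=toy_backbone,
    detection_loss=lambda feats: sum(feats),    # placeholder detection loss
    recognition_loss=lambda feats: max(feats),  # placeholder recognition loss
    rec_weight=0.5,
)

loss = spotter.joint_loss([1.0, 2.0, 3.0])
print(loss, len(call_log))
```

In a real system the two placeholder losses would be replaced by the detector and recognizer objectives, and the gradient of the combined loss would update the shared backbone from both tasks at once.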
Source journal
CiteScore: 2.80
Self-citation rate: 23.10%
Articles published: 31
About the journal: The International Journal of Fuzzy Logic and Intelligent Systems (pISSN 1598-2645, eISSN 2093-744X) is published quarterly by the Korean Institute of Intelligent Systems. The official title of the journal is International Journal of Fuzzy Logic and Intelligent Systems and the abbreviated title is Int. J. Fuzzy Log. Intell. Syst. Some or all of the articles in the journal are indexed in SCOPUS, Korea Citation Index (KCI), DOI/CrossRef, DBLP, and Google Scholar. The journal was launched in 2001 and is dedicated to disseminating well-defined theoretical and empirical study results that have a potential impact on the realization of intelligent systems based on fuzzy logic and intelligent systems theory. Specific topics include, but are not limited to: a) computational intelligence techniques including fuzzy logic systems, neural networks and evolutionary computation; b) intelligent control, instrumentation and robotics; c) adaptive signal and multimedia processing; d) intelligent information processing including pattern recognition and information processing; e) machine learning and smart systems including data mining and intelligent service practices; f) fuzzy theory and its applications.