A Text-Context-Aware CNN Network for Multi-oriented and Multi-language Scene Text Detection

Yao Xiao, Minglong Xue, Tong Lu, Yirui Wu, P. Shivakumara
Published in: 2019 International Conference on Document Analysis and Recognition (ICDAR), 2019-09-01
DOI: 10.1109/ICDAR.2019.00116
URL: https://doi.org/10.1109/ICDAR.2019.00116
Citations: 5

Abstract

Existing state-of-the-art deep learning methods for scene text detection either treat scene text as a type of general object or segment text regions directly. The latter category, built on instance segmentation algorithms, achieves remarkable detection results on scene text with arbitrary orientations and large aspect ratios. However, because these methods lack context information that accounts for the unique characteristics of scene text, directly applying instance segmentation to the text detection task is prone to low accuracy, and in particular to false positive detections. To ease this problem, we propose a novel text-context-aware CNN structure for scene text detection, which encodes channel and spatial attention information to construct a context-aware and discriminative feature map for multi-oriented and multi-language text detection tasks. With the high representation ability of the text-context-aware feature map, the proposed instance segmentation based method can not only robustly detect multi-oriented and multi-language text in natural scene images, but also produce better detection results by greatly reducing false positives. Experiments on the ICDAR2015 and ICDAR2017-MLT datasets show that the proposed method outperforms most existing methods in precision, recall, and F-measure.
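
The abstract says only that channel and spatial attention information is encoded into the feature map; it does not give the formulation. As a rough illustration of the general idea, the following is a minimal NumPy sketch of one common channel-then-spatial attention pattern (in the style of squeeze-and-excitation and CBAM). All function names, tensor shapes, and weights here are assumptions made for illustration and are not the authors' architecture.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, applied elementwise."""
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat, w1, w2):
    """Reweight the channels of a (C, H, W) feature map.

    Global average pooling gives one descriptor per channel; a small
    two-layer bottleneck (w1: (C//r, C), w2: (C, C//r)) maps it to
    per-channel gates in (0, 1).
    """
    squeezed = feat.mean(axis=(1, 2))          # (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)    # ReLU, (C//r,)
    gates = sigmoid(w2 @ hidden)               # (C,)
    return feat * gates[:, None, None]


def spatial_attention(feat, w_avg, w_max):
    """Reweight the spatial positions of a (C, H, W) feature map.

    Channel-wise mean and max maps are mixed by two scalar weights
    (a 1x1 stand-in for the larger convolution often used) and
    squashed into an (H, W) mask in (0, 1).
    """
    avg_map = feat.mean(axis=0)                # (H, W)
    max_map = feat.max(axis=0)                 # (H, W)
    mask = sigmoid(w_avg * avg_map + w_max * max_map)
    return feat * mask[None, :, :]


def context_aware_features(feat, w1, w2, w_avg=1.0, w_max=1.0):
    """Apply channel attention first, then spatial attention."""
    return spatial_attention(channel_attention(feat, w1, w2), w_avg, w_max)


rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 6, 2                        # toy sizes, not from the paper
feat = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1    # random stand-ins for learned weights
w2 = rng.standard_normal((C, C // r)) * 0.1
out = context_aware_features(feat, w1, w2)
print(out.shape)  # (8, 4, 6)
```

Because both sets of gates lie in (0, 1), the output keeps the input's shape and only attenuates activations. In a trained detector the weights would be learned so that text-like channels and locations are attenuated least, which is one plausible route to the reduction in false positives that the abstract describes.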