CSNet-PGNet: Algorithm for Scene Text Detection and Recognition

Guanjing Li
DOI: 10.1109/cvidliccea56201.2022.9824815
URL: https://doi.org/10.1109/cvidliccea56201.2022.9824815
Journal: Vision, volume 3 1, pages 1217-1224
Publication date: 2022-05-20
Publication type: Journal Article
Source platform: Semanticscholar
Citations: 1

Abstract

In recent years, the detection and recognition of scene text have developed rapidly, but two difficult challenges remain unsolved. First, semantic analysis based on convolutional neural networks and powerful ImageNet pre-training incurs high computational costs. Second, detection of scene text with irregular shapes and irregular word order is inaccurate. To address these problems, this paper proposes a novel and lightweight network module (CSNet-PGNet) for real-time reading of text of arbitrary shape and orientation. CSNet (Cross-Stage Cross-Scale network) is an extremely lightweight cross-stage, cross-scale network that abandons the cumbersome CNN backbone network (semantic classification) and can be trained from scratch. PGNet (Point Gathering Network) is a text detector and recognizer that can detect and recognize text of any shape without Non-maximum Suppression (NMS) or Region of Interest (RoI) operations, and it has the advantages of end-to-end simplicity and efficiency. The proposed CSNet-PGNet method for detecting and recognizing curved scene text is a step toward more efficient and precise detection of scene text of any shape.