Author: Guanjing Li
DOI: 10.1109/cvidliccea56201.2022.9824815
Journal: Vision, volume 3 1, pages 1217-1224
Publication date: 2022-05-20
Publication type: Journal Article
Citation count: 1
Source: https://doi.org/10.1109/cvidliccea56201.2022.9824815
CSNet-PGNet: Algorithm for Scene Text Detection and Recognition
Scene text detection and recognition have advanced rapidly in recent years, but two difficult challenges remain unsolved. First, semantic analysis based on convolutional neural networks with heavy ImageNet pre-training incurs high computational costs. Second, detection of scene text with irregular shapes and irregular word order is inaccurate. To address these problems, this paper proposes a novel, lightweight network (CSNet-PGNet) for real-time reading of text of arbitrary shape and orientation. CSNet (Cross-Stage Cross-Scale Network) is an extremely lightweight cross-stage, cross-scale network that abandons the cumbersome CNN backbone (designed for semantic classification) and can be trained from scratch. PGNet (Point Gathering Network) is a text detector and recognizer that handles text of any shape without Non-Maximum Suppression (NMS) or Region of Interest (RoI) operations, giving it the advantages of end-to-end simplicity and efficiency. The proposed CSNet-PGNet method for detecting and recognizing curved scene text is a step toward more efficient and precise scene text detection for arbitrary shapes.
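The abstract does not spell out how PGNet reads text without NMS or RoI operations. As a minimal sketch, and assuming (this is not confirmed by the abstract) that "point gathering" means sampling per-pixel character scores along an ordered text center line and then collapsing them with greedy CTC-style decoding, the idea could look like the following. The names `gather_points`, `ctc_greedy_decode`, `read_text`, and the `BLANK` index are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

BLANK = 0  # hypothetical: class index 0 is reserved for the CTC blank

ScoreMap = Sequence[Sequence[Sequence[float]]]  # indexed as [row][col][class]


def gather_points(char_map: ScoreMap,
                  points: Sequence[Tuple[int, int]]) -> List[Sequence[float]]:
    """Pick the per-class score vector at each (row, col) center-line point."""
    return [char_map[row][col] for row, col in points]


def ctc_greedy_decode(scores: Sequence[Sequence[float]], alphabet: str) -> str:
    """Take the argmax at each step, collapse repeats, then drop blanks."""
    decoded = []
    previous = BLANK
    for step in scores:
        label = max(range(len(step)), key=step.__getitem__)
        if label != BLANK and label != previous:
            # Class index k (k >= 1) maps to alphabet[k - 1].
            decoded.append(alphabet[label - 1])
        previous = label
    return "".join(decoded)


def read_text(char_map: ScoreMap,
              points: Sequence[Tuple[int, int]],
              alphabet: str) -> str:
    """Gather scores along the ordered points and decode them to a string."""
    return ctc_greedy_decode(gather_points(char_map, points), alphabet)
```

Because the characters are read directly from points ordered along the text line, a sketch like this needs no box proposals to crop and no overlapping boxes to suppress, which is one way to interpret the NMS-free and RoI-free claim. The ordering of the points is what would let it follow curved or rotated text.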