Scene Text Detection with Cascaded Multidimensional Attention

2023 3rd International Conference on Consumer Electronics and Computer Engineering (ICCECE). Pub Date: 2023-01-06. DOI: 10.1109/ICCECE58074.2023.10135187

Shan Dai


Citations: 0

Abstract



In recent years, segmentation-based scene text detection has progressed substantially because its pixel-level description is well suited to detecting long and curved text. However, limited by their scale robustness and feature representation ability, most existing segmentation-based scene text detectors struggle to handle more complex forms of text, which are common in the real world. To tackle this problem, we propose a cascaded module based on the attention mechanism, termed CMAModule, which integrates a series of basic modules to augment the feature map and thereby improve the model's feature representation capability. Our proposed CMANet obtained higher recall and precision on two benchmarks.
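The abstract does not specify how CMAModule is built, so the following is only a rough illustration of the general idea of cascading attention blocks over a feature map. It assumes, hypothetically, that each basic module applies a channel gate followed by a spatial gate with a residual connection; the function names, the gating formulas, and the absence of learned weights are all assumptions and are not taken from the paper.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(fmap):
    """Scale each channel by a sigmoid gate of its global mean (assumed form)."""
    out = []
    for channel in fmap:
        values = [v for row in channel for v in row]
        gate = sigmoid(sum(values) / len(values))
        out.append([[v * gate for v in row] for row in channel])
    return out


def spatial_attention(fmap):
    """Scale each pixel by a sigmoid gate of its mean over channels (assumed form)."""
    n_ch = len(fmap)
    height, width = len(fmap[0]), len(fmap[0][0])
    gates = [[sigmoid(sum(fmap[c][y][x] for c in range(n_ch)) / n_ch)
              for x in range(width)] for y in range(height)]
    return [[[fmap[c][y][x] * gates[y][x] for x in range(width)]
             for y in range(height)] for c in range(n_ch)]


def basic_block(fmap):
    """Hypothetical basic module: channel gate, spatial gate, residual add."""
    attended = spatial_attention(channel_attention(fmap))
    return [[[a + b for a, b in zip(row_a, row_b)]
             for row_a, row_b in zip(ch_a, ch_b)]
            for ch_a, ch_b in zip(attended, fmap)]


def cascaded_attention(fmap, num_blocks=3):
    """Apply the basic module num_blocks times in sequence (the cascade)."""
    for _ in range(num_blocks):
        fmap = basic_block(fmap)
    return fmap


# Toy feature map laid out as [channel][row][column].
feature_map = [
    [[0.0, 1.0], [2.0, 3.0]],
    [[1.0, 1.0], [1.0, 1.0]],
]
augmented = cascaded_attention(feature_map, num_blocks=2)
```

The cascade keeps the feature map's shape unchanged, so in a real detector a module of this kind could sit between the backbone and the segmentation head; a trained version would replace the fixed mean-based gates with learned layers.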
