Arbitrary shape text detection fusing InceptionNeXt and multi-scale attention mechanism

Xianguo Li, Yu Zhang, Yi Liu, Xingchen Yao, Xinyi Zhou
The Journal of Supercomputing (journal article), published 2024-08-12. DOI: 10.1007/s11227-024-06418-w (https://doi.org/10.1007/s11227-024-06418-w)

Abstract

Existing segmentation-based text detection methods generally suffer from insufficient receptive fields, inadequate filtering of text information, and difficulty balancing detection accuracy against speed, which limits their ability to detect arbitrary-shaped text in complex backgrounds. To address these problems, we propose a new text detection method that fuses the pure ConvNet model InceptionNeXt with a multi-scale attention mechanism. First, we propose a text information reinforcement module that extracts effective text information from features at different scales while preserving spatial position information. Second, we construct the InceptionNeXt Block module to compensate for insufficient receptive fields without significantly reducing speed. Finally, we design the INA-DBNet network structure to fuse local and global features and to balance accuracy and speed. Experimental results demonstrate the efficacy of our method. In particular, on the MSRA-TD500 and Total-Text datasets, INA-DBNet achieves F-measures of 91.3% and 86.7%, respectively, while maintaining real-time inference speed. Code is available at: https://github.com/yuyu678/INANET.
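
For readers unfamiliar with the metric reported above, the F-measure is the harmonic mean of detection precision and recall. The following is a minimal sketch of that arithmetic only, not the authors' evaluation code. The function names and the counts in the example are hypothetical, and the abstract does not state the matching protocol (for instance, the overlap threshold used to count a detection as correct), so matching is assumed to have been done already.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        # Both precision and recall are zero, so the score is defined as zero.
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


def detection_scores(true_positives: int, num_detections: int, num_ground_truth: int):
    """Return (precision, recall, F-measure) from already-matched counts.

    Assumption: true_positives is the number of detections that were matched
    to a ground-truth text instance by some external protocol.
    """
    precision = true_positives / num_detections if num_detections else 0.0
    recall = true_positives / num_ground_truth if num_ground_truth else 0.0
    return precision, recall, f_measure(precision, recall)


if __name__ == "__main__":
    # Hypothetical counts for illustration; these are not results from the paper.
    p, r, f = detection_scores(true_positives=90, num_detections=100, num_ground_truth=120)
    print(f"precision={p:.3f} recall={r:.3f} F-measure={f:.3f}")
```

Because the harmonic mean is dominated by the smaller of its two inputs, a high F-measure requires both few false detections and few missed text instances.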
