A irregular text detection via dilated recombination and efficient reorganization on natural scene

IF 3.5 3区 计算机科学 Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS
Liwen Huang, Wenyuan Yang
{"title":"A irregular text detection via dilated recombination and efficient reorganization on natural scene","authors":"Liwen Huang, Wenyuan Yang","doi":"10.1007/s00530-024-01360-6","DOIUrl":null,"url":null,"abstract":"<p>In recent years, scene text detection has brought out broader prospects via growing applied opportunities. Nevertheless, pointing out which detected capability and suitable instantaneity in equilibrium is an essential consideration of irregular text detection. Out of consideration for the trouble, we propose an efficient scene text detector that unites a Dilated Recombined Unit (DRU) and a Efficient Reorganized Unit (ERU), named DENet. In the beginning, input feature information is received into a DR-VanillaNet backbone. Dilated recombined unit is devised to insert into every block of DR-VanillaNet to heighten the connection about distant pixel points. Next, an FPN with efficient reorganized unit tends to exploit feature redundancy and permutate channels partially. Correspondingly, DRU and ERU work on constructive effect for precision with a limited descent of speed. Moreover, a progressive scale expansion is carried forward which maintains the ability to generate the adjacent instances successfully. Multiple experiments on CTW1500, Total-Text benchmark datasets prove that designed model intends to improve precision accompanied by a limited drop of speed. It is specifically indicated that the value of precision on these two datasets reaches 84.29% and 85.30%. And FPS are achieved by 8.6 and 10.9, respectively.</p>","PeriodicalId":51138,"journal":{"name":"Multimedia Systems","volume":"29 1","pages":""},"PeriodicalIF":3.5000,"publicationDate":"2024-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Multimedia Systems","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s00530-024-01360-6","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0

Abstract

In recent years, scene text detection has brought out broader prospects via growing applied opportunities. Nevertheless, pointing out which detected capability and suitable instantaneity in equilibrium is an essential consideration of irregular text detection. Out of consideration for the trouble, we propose an efficient scene text detector that unites a Dilated Recombined Unit (DRU) and a Efficient Reorganized Unit (ERU), named DENet. In the beginning, input feature information is received into a DR-VanillaNet backbone. Dilated recombined unit is devised to insert into every block of DR-VanillaNet to heighten the connection about distant pixel points. Next, an FPN with efficient reorganized unit tends to exploit feature redundancy and permutate channels partially. Correspondingly, DRU and ERU work on constructive effect for precision with a limited descent of speed. Moreover, a progressive scale expansion is carried forward which maintains the ability to generate the adjacent instances successfully. Multiple experiments on CTW1500, Total-Text benchmark datasets prove that designed model intends to improve precision accompanied by a limited drop of speed. It is specifically indicated that the value of precision on these two datasets reaches 84.29% and 85.30%. And FPS are achieved by 8.6 and 10.9, respectively.

Abstract Image

在自然场景中通过扩张重组和高效重组进行不规则文本检测
近年来,场景文本检测的应用机会越来越多,带来了更广阔的前景。然而,在不规则文本检测中,如何确定检测能力和合适的瞬时平衡是一个重要的考虑因素。出于对这一问题的考虑,我们提出了一种结合了稀释重组单元(DRU)和高效重组单元(ERU)的高效场景文本检测器,命名为 DENet。首先,输入的特征信息被接收到 DR-VanillaNet 骨干网。在 DR-VanillaNet 的每个区块中插入扩张重组单元,以加强与远处像素点的连接。接下来,带有高效重组单元的 FPN 往往会利用特征冗余和部分排列通道。相应地,DRU 和 ERU 利用构造效应来提高精度,但速度下降有限。此外,DRU 和 ERU 还进行了渐进式规模扩展,从而保持了成功生成相邻实例的能力。在 CTW1500、Total-Text 基准数据集上进行的多次实验证明,所设计的模型旨在提高精确度,同时限制速度的下降。具体表现在,这两个数据集的精确度值分别达到了 84.29% 和 85.30%。而 FPS 分别达到了 8.6 和 10.9。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
Multimedia Systems
Multimedia Systems 工程技术-计算机:理论方法
CiteScore
5.40
自引率
7.70%
发文量
148
审稿时长
4.5 months
期刊介绍: This journal details innovative research ideas, emerging technologies, state-of-the-art methods and tools in all aspects of multimedia computing, communication, storage, and applications. It features theoretical, experimental, and survey articles.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信