W-A net: Leveraging Atrous and Deformable Convolutions for Efficient Text Detection

Sukhad Anand, Z. Khan
Published in: 2020 Digital Image Computing: Techniques and Applications (DICTA)
Publication date: 2020-11-29
DOI: 10.1109/DICTA51227.2020.9363428 (https://doi.org/10.1109/DICTA51227.2020.9363428)
Citations: 0

Abstract

Scene text detection has attracted considerable research attention. Although recent methods can detect text with complex shapes against complex backgrounds with fairly good accuracy, they still suffer from a limited receptive field. As a result, they fail to detect extremely short or long words and therefore cannot localize words precisely in document text images. We propose a new model, which we call W-A net because of its W shape, with the middle branch consisting of atrous convolutional layers. Our model predicts a segmentation map that divides the image into word and non-word regions, as well as a boundary map that helps separate closely spaced words. We use atrous and deformable convolutional layers to increase the receptive field, which helps detect long words in an image. We treat text detection as a single problem irrespective of the background, making our model suitable for detecting text in both scene and document images. We present our findings on two scene text datasets and a receipt dataset. Our results show that our method performs better than recent scene text detection methods, which perform poorly on document text images, especially receipt images with short words.
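
To illustrate why atrous (dilated) convolutions enlarge the receptive field, the following minimal sketch compares a stack of ordinary 3x3 convolutions with a stack of dilated ones. The layer counts and dilation rates (1, 2, 4, 8) are assumptions chosen for illustration; the abstract does not state the configuration used in W-A net, and the helper names `effective_kernel` and `receptive_field` are hypothetical.

```python
def effective_kernel(k: int, dilation: int) -> int:
    """Span, in input pixels, covered by a k-tap kernel with the given dilation."""
    return dilation * (k - 1) + 1


def receptive_field(layers):
    """Receptive field (along one axis) of a stack of convolutional layers.

    layers: iterable of (kernel_size, stride, dilation) tuples,
            ordered from input to output.
    """
    rf, jump = 1, 1  # jump = distance in input pixels between adjacent outputs
    for k, s, d in layers:
        rf += (effective_kernel(k, d) - 1) * jump
        jump *= s
    return rf


# Illustrative configurations only (not taken from the paper).
plain = [(3, 1, 1)] * 4                                  # four ordinary 3x3 convs
atrous = [(3, 1, 1), (3, 1, 2), (3, 1, 4), (3, 1, 8)]    # growing dilation rates

print(receptive_field(plain))   # 9
print(receptive_field(atrous))  # 31
```

With the same number of layers and parameters, the dilated stack in this sketch sees 31 input pixels per axis instead of 9, which is the kind of widening that helps a detector cover a long word in one pass.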