文本线提取使用深度学习和最小子接缝

Adi Azran, A. Schclar, Raid Saabni
{"title":"文本线提取使用深度学习和最小子接缝","authors":"Adi Azran, A. Schclar, Raid Saabni","doi":"10.1145/3469096.3474941","DOIUrl":null,"url":null,"abstract":"Accurate text line extraction is a vital prerequisite for efficient and successful text recognition systems ranging from keywords/phrases searching to complete conversion to text. In many cases, the proposed algorithms target binary pre-processed versions of the image, which may cause insufficient results due to poor quality document images. Recently, more papers present solutions that work directly on gray-level images [1,2,7,12,15]. In this paper, we present a novel robust, and efficient algorithm to extract text-lines directly from gray-level document images. The proposed approach uses a combination of two variants of Convolutional Neural Network (CNNs), followed by minimal energy seam extraction. The first ConvNet is a modified version of the autoencoder used for biomedical image segmentation [8]. The second is a deep convolutional Neural Network, working on overlapping vertical slices of the original image. The two variants are combined to one neural net after re-attaching the resulting slices of the second net. The merged results of the two nets are used as a preprocessed image to obtain an energy map for a second phase. In the second step, we use the algorithm presented in [2], to track minimal energy sub-seams accumulated to perform a full local minimal/maximal separating and medial seam defining the text baselines and the text line regions. We have tested our approach on multi-lingual various datasets written at a range of image quality based on the ICDAR datasets.","PeriodicalId":423462,"journal":{"name":"Proceedings of the 21st ACM Symposium on Document Engineering","volume":"2016 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-08-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Text line extraction using deep learning and minimal sub seams\",\"authors\":\"Adi Azran, A. Schclar, Raid Saabni\",\"doi\":\"10.1145/3469096.3474941\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Accurate text line extraction is a vital prerequisite for efficient and successful text recognition systems ranging from keywords/phrases searching to complete conversion to text. In many cases, the proposed algorithms target binary pre-processed versions of the image, which may cause insufficient results due to poor quality document images. Recently, more papers present solutions that work directly on gray-level images [1,2,7,12,15]. In this paper, we present a novel robust, and efficient algorithm to extract text-lines directly from gray-level document images. The proposed approach uses a combination of two variants of Convolutional Neural Network (CNNs), followed by minimal energy seam extraction. The first ConvNet is a modified version of the autoencoder used for biomedical image segmentation [8]. The second is a deep convolutional Neural Network, working on overlapping vertical slices of the original image. The two variants are combined to one neural net after re-attaching the resulting slices of the second net. The merged results of the two nets are used as a preprocessed image to obtain an energy map for a second phase. In the second step, we use the algorithm presented in [2], to track minimal energy sub-seams accumulated to perform a full local minimal/maximal separating and medial seam defining the text baselines and the text line regions. We have tested our approach on multi-lingual various datasets written at a range of image quality based on the ICDAR datasets.\",\"PeriodicalId\":423462,\"journal\":{\"name\":\"Proceedings of the 21st ACM Symposium on Document Engineering\",\"volume\":\"2016 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-08-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 21st ACM Symposium on Document Engineering\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3469096.3474941\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 21st ACM Symposium on Document Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3469096.3474941","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

摘要

准确的文本行提取是有效和成功的文本识别系统的重要前提,从关键字/短语搜索到完全转换为文本。在许多情况下,所提出的算法针对的是图像的二值预处理版本,这可能会由于文档图像质量差而导致结果不足。最近,更多的论文提出了直接在灰度图像上工作的解决方案[1,2,7,12,15]。在本文中,我们提出了一种新的鲁棒、高效的算法来直接从灰度文档图像中提取文本行。该方法结合了卷积神经网络(cnn)的两种变体,然后进行最小能量层提取。第一个卷积神经网络是用于生物医学图像分割的自编码器的改进版本[8]。第二种是深度卷积神经网络,处理原始图像的重叠垂直切片。在重新连接第二个网络的结果切片后,这两个变体被合并到一个神经网络中。将两个网络的合并结果作为预处理图像,获得第二阶段的能量图。在第二步中,我们使用[2]中提出的算法来跟踪累积的最小能量子接缝,以执行完整的局部最小/最大分离和定义文本基线和文本线区域的内侧接缝。我们已经在基于ICDAR数据集的一系列图像质量的多语言各种数据集上测试了我们的方法。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Text line extraction using deep learning and minimal sub seams
Accurate text line extraction is a vital prerequisite for efficient and successful text recognition systems ranging from keywords/phrases searching to complete conversion to text. In many cases, the proposed algorithms target binary pre-processed versions of the image, which may cause insufficient results due to poor quality document images. Recently, more papers present solutions that work directly on gray-level images [1,2,7,12,15]. In this paper, we present a novel robust, and efficient algorithm to extract text-lines directly from gray-level document images. The proposed approach uses a combination of two variants of Convolutional Neural Network (CNNs), followed by minimal energy seam extraction. The first ConvNet is a modified version of the autoencoder used for biomedical image segmentation [8]. The second is a deep convolutional Neural Network, working on overlapping vertical slices of the original image. The two variants are combined to one neural net after re-attaching the resulting slices of the second net. The merged results of the two nets are used as a preprocessed image to obtain an energy map for a second phase. In the second step, we use the algorithm presented in [2], to track minimal energy sub-seams accumulated to perform a full local minimal/maximal separating and medial seam defining the text baselines and the text line regions. We have tested our approach on multi-lingual various datasets written at a range of image quality based on the ICDAR datasets.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信