Synthesizing versus Augmentation for Arabic Word Recognition with Convolutional Neural Networks

Reem Alaasam, Berat Kurar Barakat, Jihad El-Sana
Published in: 2018 IEEE 2nd International Workshop on Arabic and Derived Script Analysis and Recognition (ASAR)
Publication date: 2018-03-01
DOI: 10.1109/ASAR.2018.8480189 (https://doi.org/10.1109/ASAR.2018.8480189)
Citations: 3

Abstract

In this paper, we present a sub-word recognition method for historical Arabic manuscripts using convolutional neural networks. We investigate the benefit of extending the training set with synthetically created samples, compared with augmentation. We show that annotating around ten pages of a manuscript and extending that annotated set is sufficient for successful sub-word recognition across the whole manuscript. In addition, we show the contribution of different combinations of training sets and compare their sub-word recognition performance on the whole manuscript.
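To make the notion of extending a small annotated set by augmentation concrete, the following is a minimal sketch and not the authors' pipeline. The abstract does not say which transformations were used, so the random pixel shift, the function names `augment` and `extend_training_set`, and the parameters `max_shift` and `copies` are all assumptions made for illustration. Synthesizing new samples (creating images that were never annotated) is a separate step and is not shown here.

```python
import numpy as np


def augment(image, rng, max_shift=2):
    """Return a randomly shifted copy of a 2-D grayscale sub-word image.

    Hypothetical augmentation: the source abstract does not name the
    transformations used, so a small translation stands in for them.
    np.roll wraps pixels around the border, which is acceptable only
    when the image has an empty margin.
    """
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    return np.roll(image, shift=(int(dy), int(dx)), axis=(0, 1))


def extend_training_set(images, labels, copies, seed=0):
    """Append `copies` augmented variants of every annotated sample.

    The original samples come first, followed by the augmented ones,
    each carrying the label of the sample it was derived from.
    """
    rng = np.random.default_rng(seed)
    out_images = list(images)
    out_labels = list(labels)
    for image, label in zip(images, labels):
        for _ in range(copies):
            out_images.append(augment(image, rng))
            out_labels.append(label)
    return np.stack(out_images), np.array(out_labels)


if __name__ == "__main__":
    # Three dummy 8x8 "sub-word images" with class labels 0, 1, 2.
    demo_rng = np.random.default_rng(1)
    images = demo_rng.random((3, 8, 8))
    labels = [0, 1, 2]
    extended_images, extended_labels = extend_training_set(images, labels, copies=4)
    print(extended_images.shape, extended_labels.shape)
```

With three annotated samples and `copies=4`, the extended set holds the three originals plus twelve augmented variants. Comparing a classifier trained on such a set against one trained on originals plus synthesized samples is the kind of comparison the abstract describes.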