Recognition of Off-Line Arabic Handwriting Words Using HMM Toolkit (HTK)

Hicham El Moubtahij, A. Halli, K. Satori
Published in: 2016 13th International Conference on Computer Graphics, Imaging and Visualization (CGiV)
Publication date: 2016-05-12
DOI: 10.1109/CGIV.2016.40 (https://doi.org/10.1109/CGIV.2016.40)
Citations: 5

Abstract

Building a good handwritten Arabic recognition system is difficult because different characters can have similar shapes and human handwriting varies without limit. This paper presents a handwritten Arabic word recognition system. The objective is to propose an analytical offline recognition method for handwritten Arabic that can be implemented rapidly. The first stage of the system is preprocessing, which prepares the data. A set of simple statistical features is then extracted by a window sliding along the text line from right to left, and the resulting feature vectors are passed to the Hidden Markov Model Toolkit (HTK). In the recognition phase, the concatenation of characters into words is modelled by simple lexical models, and each word is modelled by a stochastic finite-state automaton (SFSA). The proposed system is applied to an "Arabic-Numbers" data corpus, which contains 47 words and 1905 sentences. These sentences were written by five different people.
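The right-to-left sliding-window step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the window width, the feature set, or the image representation, so everything below is an assumption made for illustration: the image is a binary grid (1 = ink), each window is split into horizontal zones, and the only feature is the ink density per zone. The function name `sliding_window_features` and its parameters are hypothetical and are not taken from the paper or from HTK.

```python
from typing import List

# Binary image: 1 = ink, 0 = background, rows listed top to bottom (assumed layout).
Image = List[List[int]]


def sliding_window_features(image: Image, width: int = 2, zones: int = 2) -> List[List[float]]:
    """Return one feature vector per window, scanning from right to left.

    Each vector holds the ink density of every horizontal zone in the window.
    The window width and zone count are illustrative assumptions.
    """
    height = len(image)
    n_cols = len(image[0]) if height else 0
    if height % zones != 0:
        raise ValueError("image height must be divisible by the number of zones")
    zone_height = height // zones

    vectors = []
    right = n_cols  # start at the right edge: Arabic is written right to left
    while right > 0:
        left = max(0, right - width)  # the last window may be narrower
        vector = []
        for z in range(zones):
            rows = image[z * zone_height:(z + 1) * zone_height]
            ink = sum(sum(row[left:right]) for row in rows)
            area = zone_height * (right - left)
            vector.append(ink / area)
        vectors.append(vector)
        right = left
    return vectors


# Toy 4x4 image with ink only in the top-right 2x2 corner.
toy = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
features = sliding_window_features(toy, width=2, zones=2)
print(features)  # [[1.0, 0.0], [0.0, 0.0]]
```

The first vector corresponds to the rightmost window, so the sequence order matches the writing direction. In a full pipeline these vectors would still have to be written out in a form HTK can read; that step is not shown here.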