HMM-based Offline Arabic Handwriting Recognition: Using New Feature Extraction and Lexicon Ranking Techniques
Hesham M. Eraqi, S. Abdelazeem
2012 International Conference on Frontiers in Handwriting Recognition, 2012-09-18
DOI: 10.1109/ICFHR.2012.214 (https://doi.org/10.1109/ICFHR.2012.214)
Citations: 8
Abstract
In this paper, a new offline Arabic handwriting recognition system is presented. The Douglas-Peucker algorithm is applied to the skeletonized parts of the offline images to convert them into piecewise linear curves, which are used for efficient detection of diacritics, noise segments, and the baseline. A hidden Markov model (HMM)-based recognizer uses features extracted from the image both before and after the diacritics are removed. A reliable method of lexicon ranking and reduction is used, based on the image's diacritics, its number of pieces of Arabic words (PAWs), and its dimensions. The proposed system has been tested on the IFN/ENIT database and has achieved promising recognition rates.
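
To illustrate how the Douglas-Peucker step turns a skeleton stroke into a piecewise linear curve, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not give the tolerance or the point ordering they use, so the `epsilon` parameter, the function names, and the sample stroke are assumptions made for illustration. The sketch also measures distance to the chord segment rather than to the infinite line, which is a common variant.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: a and b coincide.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, epsilon):
    """Simplify an ordered list of (x, y) points.

    epsilon is a hypothetical tolerance in pixels; points closer than
    epsilon to the chord between the endpoints are discarded.
    """
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    # Find the interior point farthest from the chord first-last.
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > max_dist:
            max_dist, index = d, i
    if max_dist <= epsilon:
        # Every interior point is within tolerance: keep only the chord.
        return [first, last]
    # Otherwise keep the farthest point and recurse on both halves.
    left = douglas_peucker(points[: index + 1], epsilon)
    right = douglas_peucker(points[index:], epsilon)
    return left[:-1] + right  # drop the duplicated split point


# Hypothetical skeleton stroke: a slightly wavy horizontal run, then a vertical run.
stroke = [(0, 0), (1, 0.05), (2, 0), (3, 0.05), (4, 0), (4, 1), (4, 2)]
simplified = douglas_peucker(stroke, epsilon=0.1)
```

With this sample stroke, the small wobble along the horizontal run falls within the tolerance, so only the corner and the two endpoints should survive. The resulting short list of segments is the kind of representation on which segment lengths and orientations can be inspected, for example when looking for small detached components such as diacritics or noise.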