Optical character recognition of Arabic printed text
S. Taha, Y. Babiker, M. Abbas
2012 IEEE Student Conference on Research and Development (SCOReD), 2012-12-01
DOI: https://doi.org/10.1109/SCORED.2012.6518645
Optical character recognition (OCR) systems improve human-machine interaction. They are widely used in areas such as editing and storing previously printed or handwritten documents. Much research has been done on the identification of Latin, Japanese and Chinese characters. However, very little investigation has been performed on Arabic recognition. The reason is probably the limited IT activity in Arabic-speaking countries and the difficulty and complexity of identifying Arabic characters compared to the others. Further difficulties arise from the cursive nature of Arabic text. In this paper, a technique is employed to segment printed Arabic text in order to separate the Arabic characters and then extract powerful features from each character to be recognized. To recognize characters, those features are then compared with the fields of a pre-prepared database. Although the database was prepared from characters written in the Times New Roman font, experimental results show relatively high accuracy when the method is tested on several sizes of several fonts besides Times New Roman.
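The abstract does not say which segmentation algorithm, features, or matching rule the authors used, so the sketch below only illustrates the general segment, extract, compare pipeline under stated assumptions. It assumes a vertical projection profile for segmentation, zoning densities as features, and nearest-neighbour matching against a small reference table. All function names, the toy image, and the labels "full" and "top" are hypothetical. Note also that blank-column splitting separates only unconnected pieces; truly cursive Arabic words need a finer segmentation step that is not shown here.

```python
from math import dist  # Euclidean distance, Python 3.8+


def column_profile(image):
    """Count ink pixels (1s) in each column of a binary image."""
    return [sum(col) for col in zip(*image)]


def segment_columns(image):
    """Return (start, end) column ranges separated by blank columns.

    Assumption: glyphs are separated by at least one ink-free column.
    """
    profile = column_profile(image)
    segments, start = [], None
    for x, ink in enumerate(profile):
        if ink and start is None:
            start = x                      # a glyph begins
        elif not ink and start is not None:
            segments.append((start, x))    # a glyph ends at a blank column
            start = None
    if start is not None:
        segments.append((start, len(profile)))
    return segments


def crop(image, start, end):
    """Cut columns start..end-1 out of the image."""
    return [row[start:end] for row in image]


def zoning_features(glyph, rows=2, cols=2):
    """Ink fraction in each cell of a rows x cols grid over the glyph.

    Fractions (not counts) keep the features roughly size-independent.
    """
    h, w = len(glyph), len(glyph[0])
    feats = []
    for r in range(rows):
        for c in range(cols):
            y0, y1 = r * h // rows, (r + 1) * h // rows
            x0, x1 = c * w // cols, (c + 1) * w // cols
            cells = [glyph[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            feats.append(sum(cells) / len(cells) if cells else 0.0)
    return feats


def classify(features, database):
    """Return the label whose stored feature vector is closest."""
    return min(database, key=lambda label: dist(features, database[label]))


def recognize(image, database):
    """Segment, extract features, and classify each piece, left to right.

    Arabic reads right to left, so reverse the result for reading order.
    """
    return [
        classify(zoning_features(crop(image, s, e)), database)
        for s, e in segment_columns(image)
    ]


# Toy 4x6 "page": a solid block, a blank column, then a top-heavy shape.
PAGE = [
    [1, 1, 0, 1, 1, 1],
    [1, 1, 0, 1, 1, 1],
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
]

# Hypothetical reference table standing in for the pre-prepared database.
DATABASE = {
    "full": [1.0, 1.0, 1.0, 1.0],
    "top": [1.0, 1.0, 0.0, 0.0],
}

if __name__ == "__main__":
    print(recognize(PAGE, DATABASE))
```

Because the features are ink fractions per zone, a larger rendering of the same shape maps to a similar vector, which is one plausible way a database built from a single font size could still match other sizes.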