{"title":"基于轮廓和骨架分割的阿拉伯笔迹分割","authors":"S. Wshah, Zhixin Shi, V. Govindaraju","doi":"10.1109/ICDAR.2009.152","DOIUrl":null,"url":null,"abstract":"We propose a new algorithm for segmentation of off-line handwritten Arabic words. The algorithm segments the connected letters to smaller segments each of which contains no more than three letters. Each letter may be segmented to at most five pieces. In addition to improving the recognition of Arabic words, another potential application of the proposed segmentation method is to build lexicon of small size, consisting of no more than three letter combinations. Generally, it is very hard to generate lexicon for recognition of unconstraint handwritten Arabic documents due to the large number of words of Arabic language.The algorithm has been tested on over 6300 words from 45 different documents written by 18 writers. The system is able to segment more than 93% of the words into segments, each containing at most one letter, 6% of the words into segments that contains two letters and 3% of the words into segments that contains three letters.","PeriodicalId":433762,"journal":{"name":"2009 10th International Conference on Document Analysis and Recognition","volume":"10 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2009-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"51","resultStr":"{\"title\":\"Segmentation of Arabic Handwriting Based on both Contour and Skeleton Segmentation\",\"authors\":\"S. Wshah, Zhixin Shi, V. Govindaraju\",\"doi\":\"10.1109/ICDAR.2009.152\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We propose a new algorithm for segmentation of off-line handwritten Arabic words. The algorithm segments the connected letters to smaller segments each of which contains no more than three letters. Each letter may be segmented to at most five pieces. In addition to improving the recognition of Arabic words, another potential application of the proposed segmentation method is to build lexicon of small size, consisting of no more than three letter combinations. Generally, it is very hard to generate lexicon for recognition of unconstraint handwritten Arabic documents due to the large number of words of Arabic language.The algorithm has been tested on over 6300 words from 45 different documents written by 18 writers. The system is able to segment more than 93% of the words into segments, each containing at most one letter, 6% of the words into segments that contains two letters and 3% of the words into segments that contains three letters.\",\"PeriodicalId\":433762,\"journal\":{\"name\":\"2009 10th International Conference on Document Analysis and Recognition\",\"volume\":\"10 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2009-07-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"51\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2009 10th International Conference on Document Analysis and Recognition\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICDAR.2009.152\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2009 10th International Conference on Document Analysis and Recognition","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDAR.2009.152","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Segmentation of Arabic Handwriting Based on both Contour and Skeleton Segmentation
We propose a new algorithm for segmentation of off-line handwritten Arabic words. The algorithm segments the connected letters to smaller segments each of which contains no more than three letters. Each letter may be segmented to at most five pieces. In addition to improving the recognition of Arabic words, another potential application of the proposed segmentation method is to build lexicon of small size, consisting of no more than three letter combinations. Generally, it is very hard to generate lexicon for recognition of unconstraint handwritten Arabic documents due to the large number of words of Arabic language.The algorithm has been tested on over 6300 words from 45 different documents written by 18 writers. The system is able to segment more than 93% of the words into segments, each containing at most one letter, 6% of the words into segments that contains two letters and 3% of the words into segments that contains three letters.