{"title":"利用曲率尺度空间特征识别波斯语和拉丁文字","authors":"M. Khoddami, A. Behrad","doi":"10.1109/NEUREL.2010.5644061","DOIUrl":null,"url":null,"abstract":"Script recognition is a necessary process before OCR algorithm in multilingual systems. In this paper, a novel method is proposed for identifying Farsi and Latin scripts in bilingual document using curvature scale space features. The proposed features are rotation and scale invariant and can be used to identify scripts with different fonts. We assumed that the bilingual scripts may have Farsi and English words and characters together; therefore the algorithm is designed to be able to recognize scripts in the connected components level. The output of the recognition is then generalized to word, line and page levels. Experimental results show that the proposed method has good accuracy especially in word and connected component levels.","PeriodicalId":227890,"journal":{"name":"10th Symposium on Neural Network Applications in Electrical Engineering","volume":"14 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2010-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"15","resultStr":"{\"title\":\"Farsi and Latin script identification using curvature scale space features\",\"authors\":\"M. Khoddami, A. Behrad\",\"doi\":\"10.1109/NEUREL.2010.5644061\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Script recognition is a necessary process before OCR algorithm in multilingual systems. In this paper, a novel method is proposed for identifying Farsi and Latin scripts in bilingual document using curvature scale space features. The proposed features are rotation and scale invariant and can be used to identify scripts with different fonts. We assumed that the bilingual scripts may have Farsi and English words and characters together; therefore the algorithm is designed to be able to recognize scripts in the connected components level. The output of the recognition is then generalized to word, line and page levels. Experimental results show that the proposed method has good accuracy especially in word and connected component levels.\",\"PeriodicalId\":227890,\"journal\":{\"name\":\"10th Symposium on Neural Network Applications in Electrical Engineering\",\"volume\":\"14 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2010-11-29\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"15\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"10th Symposium on Neural Network Applications in Electrical Engineering\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/NEUREL.2010.5644061\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"10th Symposium on Neural Network Applications in Electrical Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/NEUREL.2010.5644061","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Farsi and Latin script identification using curvature scale space features
Script recognition is a necessary process before OCR algorithm in multilingual systems. In this paper, a novel method is proposed for identifying Farsi and Latin scripts in bilingual document using curvature scale space features. The proposed features are rotation and scale invariant and can be used to identify scripts with different fonts. We assumed that the bilingual scripts may have Farsi and English words and characters together; therefore the algorithm is designed to be able to recognize scripts in the connected components level. The output of the recognition is then generalized to word, line and page levels. Experimental results show that the proposed method has good accuracy especially in word and connected component levels.