A fuzzy and contour-based segmentation methodology for handwritten Hindi words in legal documents
Authors: Rahul Pramanik, Soumen Bag, Ranjeet Kumar
Venue: 2018 4th International Conference on Recent Advances in Information Technology (RAIT)
Published: 2018-03-15
DOI: 10.1109/RAIT.2018.8389031 (https://doi.org/10.1109/RAIT.2018.8389031)
An automated recognition system for handwritten Hindi words in legal documents is an essential requirement in India. Good recognition accuracy requires precise segmentation. Segmentation algorithms for Hindi mostly use zone identification as a pre-segmentation stage. In the present work, we propose a character segmentation method that identifies the different zones of a word image and uses a fuzzy function to estimate the headline pixels. It then uses the outer contour of the word, together with the estimated headline pixels, to segment the upper and lower modifiers and the meaningful constituent characters. The proposed method can be used efficiently on word images that have a slight slant. We show that this work can be used to segment handwritten Hindi words in bank cheques for effective recognition. We have further experimented on a well-known dataset to show the efficacy of the proposed methodology.
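
The abstract does not give the fuzzy function or the contour procedure, so the following is only a minimal sketch of the general idea, not the authors' method. It scores each row of a binary word image with an assumed fuzzy membership (the minimum of relative ink density and a preference for the upper half of the image), takes high-scoring rows as the headline, and then splits characters at empty columns below the headline. The column-gap split is a simpler stand-in for the outer-contour analysis described above. All names (`headline_membership`, `headline_rows`, `character_spans`, `WORD`) and the 0.6 threshold are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]  # binary word image: 1 = ink, 0 = background


def headline_membership(img: Image) -> List[float]:
    """Fuzzy degree (0..1) to which each row belongs to the headline.

    Assumed membership, not taken from the paper: the minimum (fuzzy AND) of
    - density: row ink count relative to the densest row, and
    - position: 1.0 in the upper half, falling linearly to 0.0 at the bottom row.
    """
    if not img:
        return []
    height = len(img)
    counts = [sum(row) for row in img]
    peak = max(counts) or 1  # avoid division by zero on a blank image
    memberships = []
    for r, count in enumerate(counts):
        density = count / peak
        if height > 1:
            position = min(1.0, 2.0 * (height - 1 - r) / (height - 1))
        else:
            position = 1.0
        memberships.append(min(density, position))
    return memberships


def headline_rows(img: Image, threshold: float = 0.6) -> List[int]:
    """Rows whose headline membership reaches the (assumed) threshold."""
    return [r for r, mu in enumerate(headline_membership(img)) if mu >= threshold]


def character_spans(img: Image, headline: List[int]) -> List[Tuple[int, int]]:
    """Inclusive column spans that contain ink below the headline band."""
    if not img or not headline:
        return []
    below = img[max(headline) + 1:]
    width = len(img[0])
    has_ink = [any(row[c] for row in below) for c in range(width)]
    spans: List[Tuple[int, int]] = []
    start = None
    for c, ink in enumerate(has_ink):
        if ink and start is None:
            start = c  # a character region begins
        elif not ink and start is not None:
            spans.append((start, c - 1))  # an empty column closes the region
            start = None
    if start is not None:
        spans.append((start, width - 1))
    return spans


# Toy word image: an upper modifier (rows 0-1), a headline (row 2),
# and two characters below it separated by an empty column (column 4).
WORD: Image = [
    [0, 0, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1, 1],
    [1, 0, 0, 1, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 0, 1, 1, 1, 1],
    [1, 0, 0, 1, 0, 1, 0, 0, 1],
]

if __name__ == "__main__":
    rows = headline_rows(WORD)
    print("headline rows:", rows)
    print("character spans:", character_spans(WORD, rows))
```

In the toy image, row 4 is almost as dense as the headline, but the position term lowers its membership, which illustrates why a fuzzy score may be preferable to a plain ink-count maximum. Rows above the headline band would form the upper zone that holds the upper modifiers.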