Zero Shot Learning Based Script Identification in the Wild
Prateek Keserwani, K. De, P. Roy, U. Pal
2019 International Conference on Document Analysis and Recognition (ICDAR), published 2019-09-01
DOI: 10.1109/ICDAR.2019.00162 (https://doi.org/10.1109/ICDAR.2019.00162)
A text recognition system for natural images or video frames containing multilingual text must first identify the script in which the text is written and then recognize the word in that script. However, some scripts occur far less often than others. Because only a few samples of these rare scripts are available, supervised training of deep neural networks on them is difficult. To overcome this problem, we propose a zero-shot learning based method for script identification. We also propose an architecture for script identification that fuses the global feature vector with the semantic embedding vector. The semantic embedding of a script is obtained by modeling the spatial dependency of its stroke sequence with a recurrent neural network. The proposed architecture outperforms the baseline approaches.
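To illustrate the general idea of zero-shot script identification, the following is a minimal sketch and not the architecture from the paper. It assumes a common zero-shot formulation: a global image feature is projected into the semantic space and compared by cosine similarity with one embedding per script, including scripts that had no training images. The function names, the projection matrix, the script names, and all numeric values are hypothetical. In the paper the semantic embeddings come from a recurrent network over stroke sequences and are fused with the global feature; here they are fixed toy vectors.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along the given axis."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def zero_shot_predict(global_feature, projection, script_embeddings):
    """Pick the script whose semantic embedding is closest to the image.

    global_feature:    (d_visual,) feature vector of a word image
    projection:        (d_visual, d_semantic) visual-to-semantic map
    script_embeddings: dict of script name -> (d_semantic,) embedding
    """
    # Map the visual feature into the semantic space.
    projected = l2_normalize(global_feature @ projection)
    names = list(script_embeddings)
    matrix = l2_normalize(np.stack([script_embeddings[n] for n in names]))
    # Cosine similarity between the image and every candidate script.
    scores = matrix @ projected
    best = names[int(np.argmax(scores))]
    return best, dict(zip(names, scores.tolist()))


# Toy data (hypothetical): "rare_script" stands for a script that has a
# semantic embedding but no training images.
script_embeddings = {
    "latin": np.array([1.0, 0.0, 0.0]),
    "devanagari": np.array([0.0, 1.0, 0.0]),
    "rare_script": np.array([0.0, 0.0, 1.0]),
}
projection = np.eye(4, 3)  # stand-in for a learned projection
feature = np.array([0.1, 0.2, 0.9, 0.5])

best, scores = zero_shot_predict(feature, projection, script_embeddings)
print(best, scores)
```

Because the prediction depends only on the semantic embeddings, a new script can be added at test time by supplying its embedding, without retraining on images of that script.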