{"title":"基于自动编码器CNN的Devanagari场景文本识别","authors":"S. Shiravale, Jayadevan R, S. Sannakki","doi":"10.5565/REV/ELCVIA.1344","DOIUrl":null,"url":null,"abstract":"Scene text recognition is a well-rooted research domain covering a diverse application area. Recognition of scene text is challenging due to the complex nature of scene images. Various structural characteristics of the script also influence the recognition process. Text and background segmentation is a mandatory step in the scene text recognition process. A text recognition system produces the most accurate results if the structural and contextual information is preserved by the segmentation technique. Therefore, an attempt is made here to develop a robust foreground/background segmentation(separation) technique that produces the highest recognition results. A ground-truth dataset containing Devanagari scene text images is prepared for the experimentation. An encoder-decoder convolutional neural network model is used for text/background segmentation. The model is trained with Devanagari scene text images for pixel-wise classification of text and background. The segmented text is then recognized using an existing OCR engine (Tesseract). The word and character level recognition rates are computed and compared with other existing segmentation techniques to establish the effectiveness of the proposed technique.","PeriodicalId":38711,"journal":{"name":"Electronic Letters on Computer Vision and Image Analysis","volume":"37 1","pages":"55-69"},"PeriodicalIF":0.0000,"publicationDate":"2021-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":"{\"title\":\"Recognition of Devanagari Scene Text Using Autoencoder CNN\",\"authors\":\"S. Shiravale, Jayadevan R, S. Sannakki\",\"doi\":\"10.5565/REV/ELCVIA.1344\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Scene text recognition is a well-rooted research domain covering a diverse application area. Recognition of scene text is challenging due to the complex nature of scene images. Various structural characteristics of the script also influence the recognition process. Text and background segmentation is a mandatory step in the scene text recognition process. A text recognition system produces the most accurate results if the structural and contextual information is preserved by the segmentation technique. Therefore, an attempt is made here to develop a robust foreground/background segmentation(separation) technique that produces the highest recognition results. A ground-truth dataset containing Devanagari scene text images is prepared for the experimentation. An encoder-decoder convolutional neural network model is used for text/background segmentation. The model is trained with Devanagari scene text images for pixel-wise classification of text and background. The segmented text is then recognized using an existing OCR engine (Tesseract). The word and character level recognition rates are computed and compared with other existing segmentation techniques to establish the effectiveness of the proposed technique.\",\"PeriodicalId\":38711,\"journal\":{\"name\":\"Electronic Letters on Computer Vision and Image Analysis\",\"volume\":\"37 1\",\"pages\":\"55-69\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-02-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"4\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Electronic Letters on Computer Vision and Image Analysis\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.5565/REV/ELCVIA.1344\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q4\",\"JCRName\":\"Computer Science\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Electronic Letters on Computer Vision and Image Analysis","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.5565/REV/ELCVIA.1344","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"Computer Science","Score":null,"Total":0}
Recognition of Devanagari Scene Text Using Autoencoder CNN
Scene text recognition is a well-rooted research domain covering a diverse application area. Recognition of scene text is challenging due to the complex nature of scene images. Various structural characteristics of the script also influence the recognition process. Text and background segmentation is a mandatory step in the scene text recognition process. A text recognition system produces the most accurate results if the structural and contextual information is preserved by the segmentation technique. Therefore, an attempt is made here to develop a robust foreground/background segmentation(separation) technique that produces the highest recognition results. A ground-truth dataset containing Devanagari scene text images is prepared for the experimentation. An encoder-decoder convolutional neural network model is used for text/background segmentation. The model is trained with Devanagari scene text images for pixel-wise classification of text and background. The segmented text is then recognized using an existing OCR engine (Tesseract). The word and character level recognition rates are computed and compared with other existing segmentation techniques to establish the effectiveness of the proposed technique.