Wenjiao Bian, T. Wakahara, Tao Wu, He Tang, Jirui Lin
2018 Digital Image Computing: Techniques and Applications (DICTA), published 2018-12-01. DOI: https://doi.org/10.1109/DICTA.2018.8615837
Binarization of Color Character Strings in Scene Images using Deep Neural Network
This paper addresses the problem of binarizing multicolored character strings in scene images with complex backgrounds and heavy image degradation. The proposed method consists of three steps. The first step clusters the pixels of an input image into K clusters by K-means in the HSI color space, then generates a candidate binarized image for every dichotomization of those K clusters. The second step uses a deep neural network to classify each binarized image into one of two categories: character string or non-character string. The final step selects the single binarized image with the highest degree of character-string likeness as the optimal binarization result. Experimental results on the ICDAR 2003 robust word recognition dataset show that the proposed method achieves a correct binarization rate of 87.4%, which is highly competitive with the state of the art in binarization of scene character strings.
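The candidate generation and selection steps can be illustrated with a minimal sketch. It assumes the K-means cluster labels are already available as an integer array, and it is not the authors' implementation: the names `dichotomized_images`, `select_best`, and `score_fn` are hypothetical, and `score_fn` is only a stand-in for the deep neural network's character-string score. The sketch enumerates every nonempty proper subset of clusters as foreground, so each split appears in both polarities (2^K - 2 candidates); whether the paper counts both polarities is not stated in the abstract.

```python
from itertools import combinations

import numpy as np


def dichotomized_images(labels, k):
    """Yield (foreground_clusters, binary_image) for each split of k clusters.

    labels: integer array of per-pixel cluster indices in 0..k-1
            (assumed to come from K-means in HSI space).
    Every nonempty proper subset of clusters is treated as foreground,
    so a split and its inverse-polarity twin are both produced.
    """
    for size in range(1, k):
        for subset in combinations(range(k), size):
            # True where the pixel's cluster belongs to the foreground subset.
            yield subset, np.isin(labels, subset)


def select_best(labels, k, score_fn):
    """Return the candidate with the highest score under score_fn.

    score_fn is a hypothetical placeholder for a classifier that maps a
    binary image to a character-string score (higher is better).
    Ties keep the earliest candidate.
    """
    best_subset, best_image, best_score = None, None, float("-inf")
    for subset, image in dichotomized_images(labels, k):
        score = score_fn(image)
        if score > best_score:
            best_subset, best_image, best_score = subset, image, score
    return best_subset, best_image, best_score


if __name__ == "__main__":
    # Toy 2x2 label map with K = 3 clusters.
    toy_labels = np.array([[0, 1], [2, 1]])
    # Toy scorer (assumption): prefer candidates with exactly two foreground pixels.
    subset, image, score = select_best(
        toy_labels, 3, lambda img: -abs(int(img.sum()) - 2)
    )
    print(subset, score)
    print(image.astype(int))
```

Because the number of candidates grows exponentially with K, exhaustive enumeration of this kind is only practical for small K.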