Binarization of Color Character Strings in Scene Images using Deep Neural Network

2018 Digital Image Computing: Techniques and Applications (DICTA). Pub Date: 2018-12-01. DOI: 10.1109/DICTA.2018.8615837

Wenjiao Bian, T. Wakahara, Tao Wu, He Tang, Jirui Lin

Citations: 1

Abstract

Binarization of Color Character Strings in Scene Images using Deep Neural Network

This paper addresses the problem of binarizing multicolored character strings in scene images with complex backgrounds and heavy image degradation. The proposed method consists of three steps. The first step generates candidate binarized images combinatorially, one for every dichotomization of the K clusters obtained by K-means clustering of the input image's pixels in the HSI color space. The second step uses a deep neural network to classify each binarized image into one of two categories: character string or non-character string. The final step selects the single binarized image with the highest degree of character string as the optimal binarization result. Experimental results on the ICDAR 2003 robust word recognition dataset show that the proposed method achieves a correct binarization rate of 87.4%, which is highly competitive with the state of the art in binarization of scene character strings.
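
The candidate generation and selection steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the K-means cluster labels are already computed and given as a 2D grid of integers, it treats every non-empty proper subset of clusters as a possible foreground (so both polarities are enumerated, giving 2^K - 2 candidates, which may differ from the paper's exact count), and it stands in for the deep neural network with a hypothetical `score` callable supplied by the caller. The names `dichotomies`, `binarize`, and `best_binarization` are illustrative only.

```python
from itertools import combinations
from typing import Callable, FrozenSet, Iterator, List, Sequence, Tuple

Binary = List[List[int]]  # 1 = foreground pixel, 0 = background pixel


def dichotomies(k: int) -> Iterator[FrozenSet[int]]:
    """Yield every non-empty proper subset of cluster ids 0..k-1.

    Each subset is one candidate foreground; the remaining clusters
    form the background. Assumption: both polarities are kept.
    """
    for size in range(1, k):
        for subset in combinations(range(k), size):
            yield frozenset(subset)


def binarize(labels: Sequence[Sequence[int]], foreground: FrozenSet[int]) -> Binary:
    """Map a grid of cluster labels to a binary image for one dichotomy."""
    return [[1 if lab in foreground else 0 for lab in row] for row in labels]


def best_binarization(
    labels: Sequence[Sequence[int]],
    k: int,
    score: Callable[[Binary], float],
) -> Tuple[Binary, float]:
    """Return the candidate with the highest score.

    `score` is a placeholder for the classifier's degree of
    "character string"; any callable returning a float works here.
    """
    if k < 2:
        raise ValueError("need at least two clusters to form a dichotomy")
    best_img: Binary = []
    best_score = float("-inf")
    for fg in dichotomies(k):
        img = binarize(labels, fg)
        s = score(img)
        if s > best_score:
            best_img, best_score = img, s
    return best_img, best_score
```

In practice `score` would wrap a trained classifier. Because the number of candidates grows exponentially in K, a sketch like this is only practical for small K.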
