Non-uniform Illumination Document Image Binarization Using K-Means Clustering Algorithm
Xingxin Yang, Y. Wan
2021 IEEE 9th International Conference on Information, Communication and Networks (ICICN), published 2021-11-25
DOI: https://doi.org/10.1109/icicn52636.2021.9674011
A good binarization result greatly helps subsequent document image analysis and optical character recognition (OCR). However, binarizing document images under non-uniform illumination is very challenging because of the high variation between the document background and foreground. This paper describes a new K-Means clustering based algorithm for binarizing non-uniformly illuminated document images. In the proposed technique, we first obtain a combined edge map by taking the intersection of Canny's edge map and the local image contrast. We then divide the document image into small blocks and classify each block as a text or non-text block using our proposed algorithm. Finally, we binarize the text blocks using the K-Means clustering centroids. The proposed technique has been evaluated on nine non-uniform illumination document images extracted from the DIBCO datasets and one document image affected by light reflection in a scene. Experimental results show that the proposed method performs competitively against six other state-of-the-art binarization algorithms.
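The three steps named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not give the block classification rule or say how the centroids are turned into a threshold. The sketch therefore assumes that a block counts as text when its share of combined-edge pixels reaches a hypothetical `min_edge_fraction`, that K = 2, and that the threshold is the midpoint of the two centroids. The function names, the block size, and the binary `canny` and `contrast` masks (taken as precomputed inputs) are also assumptions made for illustration.

```python
from typing import List, Tuple

Image = List[List[int]]  # rows of grayscale values (0..255) or 0/1 mask entries


def combine_edge_maps(canny: Image, contrast: Image) -> Image:
    """Intersection of two binary masks: keep a pixel only if both mark it."""
    return [[1 if c and k else 0 for c, k in zip(crow, krow)]
            for crow, krow in zip(canny, contrast)]


def two_means(values: List[int], iters: int = 20) -> Tuple[float, float]:
    """1-D K-Means with K=2; returns (dark_centroid, bright_centroid)."""
    lo, hi = float(min(values)), float(max(values))
    for _ in range(iters):
        mid = (lo + hi) / 2.0  # nearest-centroid boundary in one dimension
        dark = [v for v in values if v <= mid]
        bright = [v for v in values if v > mid]
        if not dark or not bright:
            break  # only one cluster is populated (e.g. a uniform block)
        new_lo, new_hi = sum(dark) / len(dark), sum(bright) / len(bright)
        if new_lo == lo and new_hi == hi:
            break  # centroids stopped moving
        lo, hi = new_lo, new_hi
    return lo, hi


def binarize_blocks(gray: Image, edges: Image, block: int = 4,
                    min_edge_fraction: float = 0.05) -> Image:
    """Block-wise binarization: 0 = text, 255 = background.

    ASSUMPTION: the edge-fraction test stands in for the paper's text/non-text
    classifier, which the abstract does not describe.
    """
    height, width = len(gray), len(gray[0])
    out = [[255] * width for _ in range(height)]  # start as all background
    for top in range(0, height, block):
        for left in range(0, width, block):
            coords = [(r, c)
                      for r in range(top, min(top + block, height))
                      for c in range(left, min(left + block, width))]
            edge_count = sum(edges[r][c] for r, c in coords)
            if edge_count / len(coords) < min_edge_fraction:
                continue  # treated as a non-text block: left as background
            dark, bright = two_means([gray[r][c] for r, c in coords])
            if dark == bright:
                continue  # no contrast inside the block
            threshold = (dark + bright) / 2.0  # ASSUMPTION: centroid midpoint
            for r, c in coords:
                if gray[r][c] <= threshold:
                    out[r][c] = 0
    return out
```

Because the threshold is computed per block, a dark stroke on a brightly lit patch and a dimly lit empty patch are handled separately, which is the motivation for block-wise processing under non-uniform illumination.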