Y. Akbari, A. Britto, S. Al-Maadeed, Luiz Oliveira
2019 International Conference on Document Analysis and Recognition (ICDAR), published 2019-09-01. DOI: 10.1109/ICDAR.2019.00160 (https://doi.org/10.1109/ICDAR.2019.00160)
Binarization of Degraded Document Images using Convolutional Neural Networks Based on Predicted Two-Channel Images
Because most historical documents are in poor condition, binarization struggles to separate the background pixels of a document image from its foreground pixels. This paper proposes Convolutional Neural Networks (CNNs) that take predicted two-channel images as input and are trained to classify the foreground pixels. The promising results obtained with multispectral images in semantic segmentation inspired us to create a novel prediction-based two-channel image. In our method, the original image is first binarized by the structural symmetric pixels (SSPs) method, and the two-channel image is then constructed from the original image and its binarized version. To explore the impact of the proposed two-channel images as network inputs, we use two popular CNN architectures, namely SegNet and U-Net. The results presented in this work show that our approach outperforms SegNet and U-Net trained on the original images alone, and that it is competitive and robust relative to state-of-the-art results on the DIBCO database.
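As a rough illustration of the two-channel construction described in the abstract, the sketch below pairs each pixel of a grayscale image with the corresponding pixel of a preliminary binarization. It is not the authors' implementation: the SSP method is not reproduced here, so `placeholder_binarize` (a hypothetical name) stands in for it with a fixed global threshold, and `make_two_channel` and the `threshold` parameter are likewise assumptions made for the example.

```python
from typing import List, Tuple

# Rows of 8-bit intensities: low values are dark ink, high values are paper.
Gray = List[List[int]]


def placeholder_binarize(gray: Gray, threshold: int = 128) -> Gray:
    """Stand-in for the SSP binarizer (this is NOT the SSP method).

    Applies a fixed global threshold: 0 for foreground, 255 for background.
    """
    return [[0 if px < threshold else 255 for px in row] for row in gray]


def make_two_channel(gray: Gray, binary: Gray) -> List[List[Tuple[int, int]]]:
    """Pair each original pixel with its predicted label (H x W x 2 layout)."""
    same_shape = len(gray) == len(binary) and all(
        len(g_row) == len(b_row) for g_row, b_row in zip(gray, binary)
    )
    if not same_shape:
        raise ValueError("original and binarized images must have the same shape")
    return [list(zip(g_row, b_row)) for g_row, b_row in zip(gray, binary)]


# Tiny 2 x 3 example image.
gray = [[30, 200, 90], [250, 40, 180]]
two_channel = make_two_channel(gray, placeholder_binarize(gray))
```

With an array library, the same step would typically be a stack of the two images along a new channel axis; the resulting two-channel array would then replace the single-channel input of a segmentation network such as SegNet or U-Net.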