W. Mustafa, H. Aziz, W. Khairunizam, Zunaidi Ibrahim, A. Shahriman, Z. Razlan
Review of Different Binarization Approaches on Degraded Document Images
2018 International Conference on Computational Approach in Smart Systems Design and Applications (ICASSDA)
Published: 2018-08-01
DOI: https://doi.org/10.1109/ICASSDA.2018.8477621
Citations: 13
Abstract
Binarization is a key step in reading text documents automatically with optical character recognition (OCR): it segments the foreground text from the background of the image. The task becomes challenging for old document images, which usually suffer from degradation. Degradations such as uneven illumination, contrast variation, and bleed-through make binarization a considerable challenge for researchers. A binary image representation is the essential format for document analysis. This paper compares several image binarization techniques in order to find the best approach for binarizing document images. Several binarization techniques, including the Bernsen, Multiple Thresholding, Deghost, Fuzzy C-Means, and Triangle methods, were selected for comparison and applied to the H-DIBCO 2013 dataset. According to the image quality assessment (IQA) results, the Fuzzy C-Means method is more effective than the other methods. These findings give researchers a direction for future research.
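
To make the Fuzzy C-Means approach named in the abstract concrete, the following is a minimal sketch of two-cluster fuzzy c-means applied to grayscale pixel intensities. It is an illustration under stated assumptions, not the configuration evaluated in the paper: the function name `fcm_binarize`, the fuzzifier `m = 2`, the min/max initialization of the cluster centers, and the 0.5 membership cut-off are all hypothetical choices made for this example.

```python
import numpy as np


def fcm_binarize(gray, m=2.0, max_iter=100, tol=1e-4):
    """Two-cluster fuzzy c-means on pixel intensities (illustrative sketch).

    Returns a boolean mask that is True where a pixel belongs mostly to the
    darker cluster, which is assumed here to be the text.
    """
    shape = np.shape(gray)
    x = np.asarray(gray, dtype=np.float64).ravel()

    # Assumed initialization: one center at the darkest and one at the
    # brightest intensity found in the image.
    centers = np.array([x.min(), x.max()])
    if centers[0] == centers[1]:
        # A constant image has no text/background split.
        return np.zeros(shape, dtype=bool)

    exponent = 2.0 / (m - 1.0)

    def memberships(c):
        # u[i, k] = d[i, k]^(-e) / sum_j d[j, k]^(-e), the standard FCM update.
        dist = np.abs(x[None, :] - c[:, None]) + 1e-9  # avoid division by zero
        inv = dist ** (-exponent)
        return inv / inv.sum(axis=0, keepdims=True)

    for _ in range(max_iter):
        um = memberships(centers) ** m
        # Each center is the membership-weighted mean of the intensities.
        new_centers = (um @ x) / um.sum(axis=1)
        shift = np.abs(new_centers - centers).max()
        centers = new_centers
        if shift < tol:
            break

    u = memberships(centers)
    dark = int(np.argmin(centers))
    return (u[dark] > 0.5).reshape(shape)


if __name__ == "__main__":
    # Synthetic "page": bright background with one dark text stroke plus noise.
    rng = np.random.default_rng(0)
    page = np.full((20, 20), 200.0)
    page[5:8, 3:17] = 30.0
    page += rng.normal(0.0, 5.0, page.shape)
    mask = fcm_binarize(page)
    print("text pixels found:", int(mask.sum()))
```

Because this sketch clusters intensities globally, it would not by itself compensate for uneven illumination; locally adaptive methods such as Bernsen address that degradation differently.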