Shaimaa S. A. Mohamed, M. Rashwan, Sherif M. Abdou, Hassanin M. Al-Barhamtoshy
{"title":"基于补丁的文档去噪","authors":"Shaimaa S. A. Mohamed, M. Rashwan, Sherif M. Abdou, Hassanin M. Al-Barhamtoshy","doi":"10.1109/JEC-ECC.2018.8679566","DOIUrl":null,"url":null,"abstract":"Document denoising is one of the most challenging tasks in any optical character recognition system, especially when the noise type is different from white noise. Noise types are wide and, hence an effective denoising algorithm should be able to deal with different noise types. This paper introduces two denoising algorithms that are able to remove noise from Arabic documents. Our approaches are based on sparse representations over a learned dictionary and denoising auto-encoders, which have provided best state of the art results for natural images denoising. The experiments show that those two algorithms are promising in document denoising, as they provide the ability of learning a prior knowledge of clean character models to use them in the denoising process.","PeriodicalId":197824,"journal":{"name":"2018 International Japan-Africa Conference on Electronics, Communications and Computations (JAC-ECC)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Patch-Based Document Denoising\",\"authors\":\"Shaimaa S. A. Mohamed, M. Rashwan, Sherif M. Abdou, Hassanin M. Al-Barhamtoshy\",\"doi\":\"10.1109/JEC-ECC.2018.8679566\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Document denoising is one of the most challenging tasks in any optical character recognition system, especially when the noise type is different from white noise. Noise types are wide and, hence an effective denoising algorithm should be able to deal with different noise types. This paper introduces two denoising algorithms that are able to remove noise from Arabic documents. Our approaches are based on sparse representations over a learned dictionary and denoising auto-encoders, which have provided best state of the art results for natural images denoising. The experiments show that those two algorithms are promising in document denoising, as they provide the ability of learning a prior knowledge of clean character models to use them in the denoising process.\",\"PeriodicalId\":197824,\"journal\":{\"name\":\"2018 International Japan-Africa Conference on Electronics, Communications and Computations (JAC-ECC)\",\"volume\":\"16 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2018 International Japan-Africa Conference on Electronics, Communications and Computations (JAC-ECC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/JEC-ECC.2018.8679566\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 International Japan-Africa Conference on Electronics, Communications and Computations (JAC-ECC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/JEC-ECC.2018.8679566","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Document denoising is one of the most challenging tasks in any optical character recognition system, especially when the noise type is different from white noise. Noise types are wide and, hence an effective denoising algorithm should be able to deal with different noise types. This paper introduces two denoising algorithms that are able to remove noise from Arabic documents. Our approaches are based on sparse representations over a learned dictionary and denoising auto-encoders, which have provided best state of the art results for natural images denoising. The experiments show that those two algorithms are promising in document denoising, as they provide the ability of learning a prior knowledge of clean character models to use them in the denoising process.