使用深度CNN和极限学习机的实时文档图像分类

Andreas Kölsch, Muhammad Zeshan Afzal, Markus Ebbecke, M. Liwicki
{"title":"使用深度CNN和极限学习机的实时文档图像分类","authors":"Andreas Kölsch, Muhammad Zeshan Afzal, Markus Ebbecke, M. Liwicki","doi":"10.1109/ICDAR.2017.217","DOIUrl":null,"url":null,"abstract":"This paper presents an approach for real-time training and testing for document image classification. In production environments, it is crucial to perform accurate and (time-)efficient training. Existing deep learning approaches for classifying documents do not meet these requirements, as they require much time for training and fine-tuning the deep architectures. Motivated from Computer Vision, we propose a two-stage approach. The first stage trains a deep network that works as feature extractor and in the second stage, Extreme Learning Machines (ELMs) are used for classification. The proposed approach outperforms all previously reported structural and deep learning based methods with a final accuracy of 83.24% on Tobacco-3482 dataset, leading to a relative error reduction of 25% when compared to a previous Convolutional Neural Network (CNN) based approach (DeepDocClassifier). More importantly, the training time of the ELM is only 1.176 seconds and the overall prediction time for 2,482 images is 3.066 seconds. As such, this novel approach makes deep learning-based document classification suitable for large-scale real-time applications.","PeriodicalId":433676,"journal":{"name":"2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"61","resultStr":"{\"title\":\"Real-Time Document Image Classification Using Deep CNN and Extreme Learning Machines\",\"authors\":\"Andreas Kölsch, Muhammad Zeshan Afzal, Markus Ebbecke, M. Liwicki\",\"doi\":\"10.1109/ICDAR.2017.217\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper presents an approach for real-time training and testing for document image classification. In production environments, it is crucial to perform accurate and (time-)efficient training. Existing deep learning approaches for classifying documents do not meet these requirements, as they require much time for training and fine-tuning the deep architectures. Motivated from Computer Vision, we propose a two-stage approach. The first stage trains a deep network that works as feature extractor and in the second stage, Extreme Learning Machines (ELMs) are used for classification. The proposed approach outperforms all previously reported structural and deep learning based methods with a final accuracy of 83.24% on Tobacco-3482 dataset, leading to a relative error reduction of 25% when compared to a previous Convolutional Neural Network (CNN) based approach (DeepDocClassifier). More importantly, the training time of the ELM is only 1.176 seconds and the overall prediction time for 2,482 images is 3.066 seconds. As such, this novel approach makes deep learning-based document classification suitable for large-scale real-time applications.\",\"PeriodicalId\":433676,\"journal\":{\"name\":\"2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-11-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"61\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICDAR.2017.217\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDAR.2017.217","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 61

摘要

本文提出了一种用于文档图像分类的实时训练和测试方法。在生产环境中,执行准确且(时间)有效的培训是至关重要的。现有的用于文档分类的深度学习方法不满足这些要求,因为它们需要大量的时间来训练和微调深度架构。受计算机视觉的启发,我们提出了一个两阶段的方法。第一阶段训练作为特征提取器的深度网络,第二阶段使用极限学习机(elm)进行分类。所提出的方法优于之前报道的所有基于结构和深度学习的方法,在Tobacco-3482数据集上的最终准确率为83.24%,与之前基于卷积神经网络(CNN)的方法(DeepDocClassifier)相比,相对误差降低了25%。更重要的是,ELM的训练时间仅为1.176秒,对2482张图像的整体预测时间为3.066秒。因此,这种新颖的方法使得基于深度学习的文档分类适合于大规模的实时应用。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Real-Time Document Image Classification Using Deep CNN and Extreme Learning Machines
This paper presents an approach for real-time training and testing for document image classification. In production environments, it is crucial to perform accurate and (time-)efficient training. Existing deep learning approaches for classifying documents do not meet these requirements, as they require much time for training and fine-tuning the deep architectures. Motivated from Computer Vision, we propose a two-stage approach. The first stage trains a deep network that works as feature extractor and in the second stage, Extreme Learning Machines (ELMs) are used for classification. The proposed approach outperforms all previously reported structural and deep learning based methods with a final accuracy of 83.24% on Tobacco-3482 dataset, leading to a relative error reduction of 25% when compared to a previous Convolutional Neural Network (CNN) based approach (DeepDocClassifier). More importantly, the training time of the ELM is only 1.176 seconds and the overall prediction time for 2,482 images is 3.066 seconds. As such, this novel approach makes deep learning-based document classification suitable for large-scale real-time applications.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信