Comparison of convolutional neural network models for document image classification
Doggucan Yaman, Fevziye Irem Eyiokur, H. K. Ekenel
2017 25th Signal Processing and Communications Applications Conference (SIU), 2017-05-15
DOI: 10.1109/SIU.2017.7960562
Citation count: 1
Abstract
Despite increasing digitization, documents are still in very common use today. These documents must be correctly labeled and classified so that they can be archived in an accessible manner. In this study, we used state-of-the-art convolutional neural network (CNN) models to meet this need. CNNs achieve high classification performance compared to alternative methods because their deep architectures can learn strong, rich features from large amounts of data. For the experiments, we used a dataset containing 400,000 images from 16 document classes. The state-of-the-art deep learning models were fine-tuned and compared in detail. The VGG-16 architecture achieved the best performance on this dataset, with a correct classification rate of 90.93%.
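
The abstract reports its result as a correct classification rate. As a minimal sketch of how such a figure is typically computed, the snippet below takes it to be the fraction of images whose predicted class matches the true class; that definition is an assumption here, since the abstract does not spell it out. The function names, class names, and toy labels are all hypothetical and are not taken from the paper's dataset or results.

```python
from collections import Counter


def correct_classification_rate(true_labels, predicted_labels):
    """Return the fraction of samples whose predicted class equals the true class."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("at least one sample is required")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def per_class_rate(true_labels, predicted_labels):
    """Return {class: fraction of that class's samples predicted correctly}."""
    totals = Counter(true_labels)
    hits = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {c: hits[c] / totals[c] for c in totals}


# Toy data with made-up class names (assumption: not the paper's labels or results).
y_true = ["letter", "invoice", "memo", "letter", "invoice", "memo", "letter", "memo"]
y_pred = ["letter", "invoice", "letter", "letter", "memo", "memo", "letter", "memo"]

overall = correct_classification_rate(y_true, y_pred)
by_class = per_class_rate(y_true, y_pred)
print(f"overall correct classification rate: {overall:.2%}")
for name, rate in sorted(by_class.items()):
    print(f"  {name}: {rate:.2%}")
```

Reporting the per-class rates alongside the overall rate can help when comparing models on a multi-class dataset, since a single overall figure may hide classes that are frequently confused with one another.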