Performance Evaluation of Advanced Deep Learning Architectures for Offline Handwritten Character Recognition

2017 International Conference on Frontiers of Information Technology (FIT). Pub Date: 2017-12-01. DOI: 10.1109/FIT.2017.00071

Moazam Soomro, Muhammad Ali Farooq, R. H. Raza


Citations: 13

Abstract

This paper presents a comparison and performance evaluation of handwritten character recognition methods, aiming at robust and precise classification of different handwritten characters. The system uses an advanced multilayer deep neural network that learns features directly from raw pixel values. Because learning complex features with conventional neural networks is very challenging, the hidden layers are stacked to form deep hierarchies of non-linear features. Two state-of-the-art deep learning architectures were used: the Caffe AlexNet [5] and GoogLeNet [6] models in NVIDIA DIGITS [10]. The models were trained and tested on two different datasets to incorporate diversity and complexity. The first is the publicly available Chars74K dataset [4], which comprises 7705 characters and contains uppercase and lowercase English letters along with numerical digits. The second, created locally, consists of 4320 characters in 62 classes and was produced by 40 subjects. It also contains uppercase and lowercase English letters along with numerical digits. The overall dataset is divided into 80% for the training phase and 20% for the testing phase. The training phase takes approximately 90 minutes. For validation, the results obtained were compared with the ground truth. The accuracy achieved was 77.77% with AlexNet and 88.89% with GoogLeNet. The higher accuracy of GoogLeNet is due to its unique combination of inception modules, each of which includes pooling, convolutions at various scales, and concatenation.
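The 80%/20% division described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the split was performed, so the per-class (stratified) strategy, the function name `stratified_split`, the fixed seed, and the toy labels below are all assumptions made for illustration; they are not the authors' procedure or data.

```python
import random
import string
from collections import defaultdict


def stratified_split(labels, train_fraction=0.8, seed=0):
    """Split sample indices into train/test, class by class.

    Hypothetical helper: splitting per class is an assumption, chosen so
    that every class keeps roughly the same 80/20 proportion.
    """
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return train, test


# The 62 classes: 10 digits + 26 uppercase + 26 lowercase letters.
CLASSES = string.digits + string.ascii_uppercase + string.ascii_lowercase

# Toy labels (10 samples per class); NOT the paper's datasets.
labels = [c for c in CLASSES for _ in range(10)]

train_idx, test_idx = stratified_split(labels)
print(len(CLASSES), len(train_idx), len(test_idx))  # prints: 62 496 124
```

With real data, `labels` would hold one class label per image, and the returned index lists would select the training and testing images.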
