Manuscripts Character Recognition Using Machine Learning and Deep Learning

WIT transactions on modelling and simulation. Pub Date: 2023-04-04. DOI: 10.3390/modelling4020010

Mohammad Anwarul Islam, I. Iacob

Citations: 2

Abstract

The automatic character recognition of historic documents has recently gained more attention from scholars, owing to major improvements in computer vision, image processing, and digitization. While neural networks, the current state-of-the-art models for image recognition, perform very well, they typically require large amounts of training data. In our study we manually built our own relatively small dataset of 404 characters by cropping letter images from a popular historic manuscript, the Electronic Beowulf. To compensate for the small dataset, we used ImageDataGenerator, a Python image augmentation utility (part of Keras), to augment our Beowulf manuscript’s dataset. The training dataset was augmented once, twice, and thrice, which we call resampling 1, resampling 2, and resampling 3, respectively. To classify the manuscript’s character images efficiently, we developed a customized Convolutional Neural Network (CNN) model. We compared the results achieved by our proposed model with those of other machine learning (ML) models: support vector machine (SVM), K-nearest neighbor (KNN), decision tree (DT), random forest (RF), and XGBoost. For these ML models, we used pretrained networks such as VGG16, MobileNet, and ResNet50 to extract features from the character images, then trained and tested the models on those features and recorded the results. Moreover, we validated our proposed CNN model against the well-established MNIST dataset. On the Beowulf manuscript’s data, our proposed CNN model achieves recognition accuracies of 88.67%, 90.91%, and 98.86% for resampling 1, resampling 2, and resampling 3, respectively. Additionally, our CNN model achieves a recognition accuracy of 99.03% on the MNIST benchmark dataset.
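The resampling scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: it uses only the Python standard library, a simple random pixel shift stands in for the transformations that ImageDataGenerator provides, and it assumes that "augmented k times" means the original training images are kept and k augmented copies of each are appended. The function names `random_shift` and `resample`, the `max_shift` and `seed` parameters, and the toy 3x3 images are all hypothetical.

```python
import random


def random_shift(image, max_shift, rng):
    """Return a copy of a 2D image (list of rows) shifted by a random
    offset of up to max_shift pixels per axis, padding with zeros."""
    height, width = len(image), len(image[0])
    dy = rng.randint(-max_shift, max_shift)
    dx = rng.randint(-max_shift, max_shift)
    shifted = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            src_y, src_x = y - dy, x - dx
            if 0 <= src_y < height and 0 <= src_x < width:
                shifted[y][x] = image[src_y][src_x]
    return shifted


def resample(images, labels, rounds, max_shift=1, seed=0):
    """Assumed scheme: keep the originals and append `rounds` augmented
    copies of every training image, each with its original label."""
    rng = random.Random(seed)
    out_images = list(images)
    out_labels = list(labels)
    for _ in range(rounds):
        for image, label in zip(images, labels):
            out_images.append(random_shift(image, max_shift, rng))
            out_labels.append(label)
    return out_images, out_labels


# Toy stand-ins for cropped character images (hypothetical data).
toy_images = [
    [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    [[1, 1, 1], [0, 0, 0], [1, 1, 1]],
]
toy_labels = ["i", "z"]

for rounds in (1, 2, 3):
    aug_images, aug_labels = resample(toy_images, toy_labels, rounds)
    print(f"resampling {rounds}: {len(aug_images)} training images")
```

Under this assumption, the training set grows linearly with the number of resampling rounds while the test set stays untouched, which is one plausible reading of why accuracy improves from resampling 1 to resampling 3.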

