Optical character recognition (OCR) using partial least square (PLS) based feature reduction: an application to artificial intelligence for biometric identification

Zainab Akhtar, Jong Weon Lee, Muhammad Attique Khan, M. Sharif, S. Khan, Naveed Riaz
Journal: J. Enterp. Inf. Manag. Publication date: 2020-07-31. Publication type: Journal Article. DOI: 10.1108/jeim-02-2020-0076 (https://doi.org/10.1108/jeim-02-2020-0076)
Citations: 11

Abstract

Purpose: In artificial intelligence, optical character recognition (OCR) is an active research area driven by well-known applications such as automating the conversion of printed documents into machine-readable text. In academia and banking, the main purpose of OCR is to achieve high recognition performance while saving storage space.

Design/methodology/approach: A novel technique is proposed for automated OCR based on the fusion and selection of multi-property features. The features are fused serially, and the fused output is passed to a partial least squares (PLS) based selection method. Selection is performed using an entropy fitness function. The final features are classified by an ensemble classifier.

Findings: The presented method was extensively tested on two datasets, the authors' own dataset and the Chars74k benchmark, and achieved accuracies of 91.2% and 99.9%, respectively. Comparison with existing techniques shows that the proposed method gives improved performance.

Originality/value: The technique presented in this work can support license plate recognition and the conversion of printed documents into machine-readable text.
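The abstract names the pipeline stages but gives no formulas, so the following is only a minimal sketch of two of them under stated assumptions. Serial fusion is taken to mean concatenating the per-sample feature vectors, and the entropy fitness function is approximated by ranking each fused feature by the Shannon entropy of its binned values. The function names (`fuse_serial`, `column_entropy`, `select_by_entropy`), the equal-width binning, and the top-k rule are illustrative assumptions, not the authors' formulation; the PLS projection and the ensemble classifier are omitted.

```python
import math
from collections import Counter


def fuse_serial(*feature_sets):
    """Serially fuse feature sets by concatenating each sample's vectors.

    Assumption: "serial" fusion means plain end-to-end concatenation.
    """
    n_samples = len(feature_sets[0])
    if any(len(fs) != n_samples for fs in feature_sets):
        raise ValueError("all feature sets must describe the same samples")
    return [
        [value for fs in feature_sets for value in fs[i]]
        for i in range(n_samples)
    ]


def column_entropy(column, bins=4):
    """Shannon entropy (in bits) of one feature after equal-width binning."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return 0.0  # a constant feature carries no information
    width = (hi - lo) / bins
    # Clamp the maximum value into the last bin.
    counts = Counter(min(int((v - lo) / width), bins - 1) for v in column)
    n = len(column)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def select_by_entropy(fused, k, bins=4):
    """Keep the k highest-entropy features (a stand-in fitness criterion).

    Returns the kept column indices and the reduced feature matrix.
    """
    n_features = len(fused[0])
    scores = [
        column_entropy([row[j] for row in fused], bins)
        for j in range(n_features)
    ]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    keep = sorted(ranked[:k])
    return keep, [[row[j] for j in keep] for row in fused]


# Hypothetical toy data: two feature families for four character images.
shape_feats = [[0.0, 5.0], [3.0, 5.0], [1.0, 5.0], [2.0, 5.0]]
texture_feats = [[1.0], [1.0], [2.0], [2.0]]

fused = fuse_serial(shape_feats, texture_feats)
keep, reduced = select_by_entropy(fused, k=2)
print(keep)     # indices of the retained fused features
print(reduced)  # feature matrix passed on to the classifier
```

In this toy example the constant second shape feature scores zero entropy and is dropped, which is the kind of redundancy a selection step is meant to remove. In a fuller pipeline, the reduced matrix would be the input to the PLS step and the ensemble classifier described in the abstract.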