A Flourished Approach for Recognizing Text in Digital and Natural Frames

M. Dutta, Dhonita Tripura, Jugal Krishna Das
Journal of Engineering Research and Sciences. Published 2024-07-01. DOI: https://doi.org/10.55708/js0307005

Abstract

Obtaining reliable text detection and recognition results for natural scene images and digital frames is a very challenging task. Although text identification for English has advanced significantly, applying these methods to languages such as Bengali poses particular difficulties because of variations in script and morphology. Text identification and recognition are carried out in several distinct steps. First, an image is captured with a device, and Connected Component Analysis (CCA) and a Conditional Random Field (CRF) model are applied to localize text components. Second, a merged model combining a region-based convolutional neural network (Mask R-CNN) with a Feature Pyramid Network (FPN) is used to detect and classify text in images and convert it into computerized form. Further, we introduce a combined method of a Convolutional Recurrent Neural Network (CRNN) and Connectionist Temporal Classification (CTC) with the K-Nearest Neighbors (KNN) algorithm for extracting text from images and frames. Because the goal of this research is to detect and recognize text using a machine-learning-based model, a new Fast Iterative Nearest Neighbor (Fast INN) algorithm is proposed, based on the patterns and shapes of text components. Our research addresses the bilingual case (Bengali and English) and produces satisfactory experimental results, with around 98% accuracy for the proposed text recognition methods, which is better than previous studies.
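The recognition stage named in the abstract pairs a CRNN with CTC. As a rough illustration of what CTC decoding does with per-frame network outputs, the sketch below implements greedy (best-path) decoding: take the highest-scoring label in each frame, merge consecutive repeats, and drop the blank symbol. This is not the authors' implementation; the function name `ctc_greedy_decode`, the choice of index 0 for the blank, and the toy alphabet are all assumptions made for illustration.

```python
from typing import Sequence

BLANK = 0  # assumption: label 0 is the CTC blank; labels 1..N map to alphabet[0..N-1]


def ctc_greedy_decode(scores: Sequence[Sequence[float]], alphabet: str) -> str:
    """Best-path CTC decoding over a (frames x labels) score matrix."""
    # Step 1: pick the highest-scoring label index in every frame.
    best_path = [max(range(len(frame)), key=frame.__getitem__) for frame in scores]

    # Step 2: merge consecutive repeats, then drop blanks.
    decoded = []
    previous = BLANK
    for label in best_path:
        if label != BLANK and label != previous:
            decoded.append(alphabet[label - 1])
        previous = label
    return "".join(decoded)


# Toy example with a hypothetical two-letter alphabet "ab".
toy_scores = [
    [0.10, 0.80, 0.10],  # 'a'
    [0.10, 0.70, 0.20],  # 'a' again (repeat, merged with the previous frame)
    [0.90, 0.05, 0.05],  # blank (separates the two 'a' runs)
    [0.20, 0.70, 0.10],  # 'a'
    [0.10, 0.20, 0.70],  # 'b'
]
result = ctc_greedy_decode(toy_scores, "ab")
```

The blank between the two runs of `a` is what lets CTC emit a doubled character; without it the repeated frames would collapse into a single `a`.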