A Flourished Approach for Recognizing Text in Digital and Natural Frames

Journal of Engineering Research and Sciences Pub Date : 2024-07-01 DOI:10.55708/js0307005

M. Dutta, Dhonita Tripura, Jugal Krishna Das


Abstract

Obtaining reliable text detection and recognition results for natural scene images and digital frames is a very challenging task. Although text identification methods for English have advanced significantly, applying them to languages such as Bengali poses particular difficulties because of variations in script and morphology. Text identification and recognition is carried out in several distinct steps. First, a photo is captured with a device, and Connected Component Analysis (CCA) and a Conditional Random Field (CRF) model are used to localize text components. Second, a merged model combining a region-based convolutional neural network (Mask R-CNN) with a Feature Pyramid Network (FPN) is used to detect and classify text in images and convert it into computerized form. Further, we introduce a combined method of a Convolutional Recurrent Neural Network (CRNN) and Connectionist Temporal Classification (CTC) with the K-Nearest Neighbors (KNN) algorithm for extracting text from images and frames. Because the goal of this research is to detect and recognize text using a machine learning-based model, a new Fast Iterative Nearest Neighbor (Fast INN) algorithm is proposed, based on the patterns and shapes of text components. Our research addresses the bilingual case (Bengali and English) and produces satisfactory experimental results: the proposed text recognition methods achieve around 98% accuracy, which is better than previous studies.
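Two of the steps named in the abstract, connected component analysis for localizing candidate text regions and CTC decoding of per-frame recognizer outputs, can be illustrated in isolation. The sketch below is not the authors' implementation: the abstract does not specify connectivity, area thresholds, or the character set, so the 4-connectivity, the `min_area` parameter, the function names, and the toy alphabet are all assumptions made for illustration. The CRF, Mask R-CNN/FPN, KNN, and Fast INN stages are not covered.

```python
from collections import deque


def connected_components(binary, min_area=1):
    """Return bounding boxes of 4-connected foreground regions.

    `binary` is a list of rows of 0/1 values (assumed already thresholded).
    Each box is (top, left, bottom, right), inclusive, listed in the
    row-major order in which the regions are first encountered.
    """
    rows = len(binary)
    cols = len(binary[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            area = 0
            while queue:
                y, x = queue.popleft()
                area += 1
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Small regions are treated as noise (hypothetical threshold).
            if area >= min_area:
                boxes.append((top, left, bottom, right))
    return boxes


def ctc_greedy_decode(frame_scores, alphabet, blank=0):
    """Best-path CTC decoding: argmax per frame, merge repeats, drop blanks.

    `frame_scores` holds one list of class scores per time step, and
    `alphabet[i]` is the character for class i (the entry at `blank`
    is a placeholder and is never emitted).
    """
    decoded = []
    previous = blank
    for scores in frame_scores:
        best = max(range(len(scores)), key=scores.__getitem__)
        if best != blank and best != previous:
            decoded.append(alphabet[best])
        previous = best
    return "".join(decoded)
```

In a pipeline of the kind described, the boxes from `connected_components` would serve as candidate text regions for later filtering, and `ctc_greedy_decode` would turn a CRNN's per-frame class scores into a character string. A blank frame between two identical predictions is what allows a doubled letter to survive the merge step.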


