Deep Learning Based Marathi Sentence Recognition using Devnagari Character Identification

Rupal S. Patil, Bhairav Narkhede, Stuti Gaonkar, Tirth Dave
Published in: 2023 International Conference on Communication System, Computing and IT Applications (CSCITA), 2023-03-31
DOI: 10.1109/CSCITA55725.2023.10104985 (https://doi.org/10.1109/CSCITA55725.2023.10104985)

Abstract

There are multiple algorithms available to recognize Marathi Devnagari characters. Most of them are limited by the large number of character variations produced by Kana, Matra, Ukar, Velanti, and Anusvar, which are specific to the Marathi Barakhadi system. Full Marathi sentence recognition also requires dictionary-based word formation. The proposed work recognizes a Marathi sentence by combining detection of all 454 Devnagari character variations with nearest-dictionary-word mapping using a k-nearest neighbour (KNN) model. This is the first attempt at recognizing the full set of 454 characters (Vyanjan variations as per Barakhadi), rather than the traditional 58 characters (Vyanjans), as a basis for sentence recognition. The proposed method achieves a sentence recognition accuracy of 86.84% and a 454-character classification accuracy of 89.52%, with an execution time of 1.464 seconds per word. To train the character recognition network, a separate dataset was created covering all Vyanjan variations as per Barakhadi. This contribution may encourage further research on Devnagari sentence recognition.
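The nearest-dictionary-word step can be illustrated with a small sketch. The abstract does not say which features or distance measure the KNN model uses, so the code below assumes a nearest-neighbour lookup under Levenshtein edit distance over sequences of predicted character-class labels. The function names, the Latin labels, and the toy dictionary are all hypothetical stand-ins, not the authors' implementation.

```python
from typing import List, Sequence, Tuple


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two sequences of character labels."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute x with y
        prev = curr
    return prev[-1]


def nearest_words(predicted: Sequence[str],
                  dictionary: List[Tuple[str, ...]],
                  k: int = 1) -> List[Tuple[str, ...]]:
    """Return the k dictionary entries closest to the predicted sequence."""
    ranked = sorted(dictionary, key=lambda w: edit_distance(predicted, w))
    return ranked[:k]


# Hypothetical toy dictionary: each word is a tuple of character-class labels
# (placeholders for the 454 Barakhadi classes, not real Marathi words).
TOY_DICTIONARY = [("ka", "ma", "la"), ("na", "ma", "na"), ("pa", "ta")]

# Assume the character classifier misread the last character of the first word.
corrected = nearest_words(("ka", "ma", "ra"), TOY_DICTIONARY, k=1)[0]
print(corrected)  # ('ka', 'ma', 'la')
```

With k = 1 this reduces to picking the single closest dictionary entry; a larger k would return several candidates that a later stage could rank, for example by sentence context.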