Deep Learning Based Marathi Sentence Recognition using Devnagari Character Identification

2023 International Conference on Communication System, Computing and IT Applications (CSCITA). Pub Date: 2023-03-31. DOI: 10.1109/CSCITA55725.2023.10104985

Rupal S. Patil, Bhairav Narkhede, Stuti Gaonkar, Tirth Dave


Citation count: 0

Abstract



Multiple algorithms are available for recognizing Marathi Devnagari characters. Most of them are limited by the large number of character variations due to Kana, Matra, Ukar, Velanti, and Anusvar, which are specific to the Marathi grammar known as Barakhadi. Full Marathi sentence recognition also requires dictionary-based word formation. The proposed work recognizes a Marathi sentence by combining detection of all 454 Devnagari character variations with nearest-dictionary-word mapping using a k-nearest neighbour (KNN) model. This is the first attempt at recognizing the full set of 454 characters (Vyanjan variations as per Barakhadi), rather than the traditional 58 characters (Vyanjans), as a route to sentence recognition. The proposed method achieves a sentence recognition accuracy of 86.84% and a 454-character classification accuracy of 89.52%, with an execution speed of 1.464 seconds per word. To train the character recognition network, a separate dataset was created covering all Vyanjan variations as per Barakhadi. The authors expect this contribution to encourage further research on Devnagari sentence recognition.
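The dictionary-mapping step can be illustrated with a minimal sketch. The abstract does not say which features or distance the authors' KNN model uses, so the edit-distance metric, the tiny `DICTIONARY`, and the function names `edit_distance`, `nearest_words`, and `correct_sentence` below are all assumptions made for illustration, not the paper's implementation. The idea shown is only that a word assembled from imperfect character predictions is snapped to its nearest dictionary entry.

```python
# Illustrative sketch only: nearest-dictionary-word lookup with k neighbours.
# Assumption: distance is Levenshtein edit distance over Unicode code points;
# the paper's actual KNN features and metric are not given in the abstract.

def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def nearest_words(word: str, dictionary: list[str], k: int = 1) -> list[str]:
    """Return the k dictionary words closest to `word` by edit distance."""
    return sorted(dictionary, key=lambda w: edit_distance(word, w))[:k]


def correct_sentence(words: list[str], dictionary: list[str]) -> str:
    """Map each recognized word to its single nearest dictionary word."""
    return " ".join(nearest_words(w, dictionary, k=1)[0] for w in words)


# Hypothetical three-word dictionary, for demonstration only.
DICTIONARY = ["माझी", "शाळा", "आहे"]

if __name__ == "__main__":
    # "शळा" stands in for a recognizer output that missed one vowel sign.
    print(correct_sentence(["माझी", "शळा", "आहे"], DICTIONARY))
```

With a realistic dictionary, a linear scan per word would be slow; an indexed nearest-neighbour structure would be the natural replacement, but that choice is outside what the abstract describes.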
