Summary of Spoken Indian Languages Classification Using ML and DL

2022 6th International Conference on Electronics, Communication and Aerospace Technology Pub Date : 2022-12-01 DOI:10.1109/ICECA55336.2022.10009380

Riya Shah, Barkha M. Joshi, J. Shah, Milin M Patel, A. Rana, Ronak Roy



Abstract

Unlike in some other parts of the world, speech recognition technology is legal in the West, but this is not the case to the same degree in East Asian countries. Linguistic barriers may be a major cause of this gap. Moreover, voice-based language identification can only become practical if countries with many languages, such as India, are taken into account. The challenge lies in finding a technique that clearly and effectively identifies the features that differentiate languages. The model processes audio data by first creating spectrogram images from it and then extracting features from those images. Deep learning (DL) is then employed to streamline the output identification process by emphasizing the most important characteristics and attributes. A major inspiration for the concept was the realization that a person's vocal signal can be understood or observed. This work employs spectrograms (as visual data) and deep learning techniques to classify Indic languages in the IIITH Indic voice database. Finally, a comparative analysis of models based on accuracy, precision, recall, and F1-score is conducted to show that the proposed approach is more robust than existing models.
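The two technical steps named in the abstract, turning audio into a spectrogram and scoring a classifier by accuracy, precision, recall, and F1-score, can be illustrated with a minimal sketch. The abstract does not give the frame length, hop size, window, or network architecture, so everything below is an assumption made for illustration: the function names `log_spectrogram` and `macro_scores` are hypothetical, a Hann-windowed short-time Fourier transform stands in for whatever spectrogram the authors used, and macro averaging is one of several ways the per-class metrics could have been combined.

```python
import numpy as np


def log_spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude spectrogram, shape (n_frames, frame_len // 2 + 1).

    frame_len and hop are illustrative defaults, not values from the paper.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if signal.size < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)
    n_frames = 1 + (signal.size - frame_len) // hop
    # Slice overlapping frames and apply the window to each one.
    frames = np.stack(
        [signal[i * hop:i * hop + frame_len] * window for i in range(n_frames)]
    )
    # Real FFT per frame; log1p compresses the dynamic range.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return np.log1p(magnitude)


def macro_scores(y_true, y_pred, n_classes):
    """Accuracy plus macro-averaged precision, recall and F1-score."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    precisions, recalls, f1s = [], [], []
    for c in range(n_classes):
        tp = int(np.sum((y_pred == c) & (y_true == c)))
        fp = int(np.sum((y_pred == c) & (y_true != c)))
        fn = int(np.sum((y_pred != c) & (y_true == c)))
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    return {
        "accuracy": float(np.mean(y_true == y_pred)),
        "precision": float(np.mean(precisions)),
        "recall": float(np.mean(recalls)),
        "f1": float(np.mean(f1s)),
    }


if __name__ == "__main__":
    # Synthetic 1 kHz tone sampled at 8 kHz, standing in for a speech clip.
    sr = 8000
    t = np.arange(sr) / sr
    spec = log_spectrogram(np.sin(2 * np.pi * 1000 * t))
    print(spec.shape)
    # Made-up labels for three language classes, for illustration only.
    print(macro_scores([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], n_classes=3))
```

In a pipeline like the one described, the 2-D array returned by `log_spectrogram` would be treated as an image and passed to a deep learning classifier; that model is omitted here because the abstract does not describe it.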
