RECAL: A language identification system

2017 International Conference on Signal Processing and Communication (ICSPC). Pub Date: 2017-07-01. DOI: 10.1109/CSPC.2017.8305857

Himadri Mukherjee, Ankita Dhar, S. Phadikar, K. Roy

Citations: 14

Abstract


Since the inception of IT, one of the primary concerns has been to build devices that are easy to interact with. Speech is among the most preferred and easiest modes of interaction. Speech recognition is the technique of automatically identifying spoken words from voice signals. Owing to the multilingual nature of our country, we are accustomed to mixing languages in verbal interaction, so it is essential to determine the language of the spoken words before recognizing them. RECAL (Record Extract Classify According to Language) is a system aimed at identifying languages from multilingual voice signals. As a first step, Mel Frequency Cepstral Coefficient (MFCC) based features were used to model languages using 9300 uttered numerals across three languages (English, Bangla and Hindi). An accuracy of 98.39% was obtained, a notable result given the similarity between Bangla and Hindi numerals and the deliberate avoidance of noise gating to simulate a real-world environment.
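
The abstract names MFCC-based features but gives no extraction details, so the following is only a minimal sketch of how MFCCs are commonly computed for a single audio frame: windowing, power spectrum, mel-spaced triangular filters, log energies, and a DCT. The function names, the frame length, the 8 kHz sample rate, the 20 filters and the 13 coefficients are all illustrative assumptions and are not taken from RECAL. The classifier that would consume these features is not shown because the abstract does not describe it.

```python
import math

def hz_to_mel(f_hz):
    # Common mel-scale formula; the paper's exact variant is not stated.
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def power_spectrum(frame):
    """Power spectrum of one Hamming-windowed frame, using a naive DFT."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        spectrum.append((re * re + im * im) / n)
    return spectrum

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    n_bins = n_fft // 2 + 1
    top_mel = hz_to_mel(sample_rate / 2.0)
    edges_hz = [mel_to_hz(top_mel * i / (n_filters + 1)) for i in range(n_filters + 2)]
    edges = [hz * n_fft / sample_rate for hz in edges_hz]  # fractional bin positions
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for k in range(n_bins):
            if lo < k <= mid:
                row.append((k - lo) / (mid - lo))   # rising slope
            elif mid < k < hi:
                row.append((hi - k) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def mfcc_frame(frame, sample_rate, n_filters=20, n_coeffs=13):
    """MFCCs of one frame; n_filters and n_coeffs are assumed defaults."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # Small constant avoids log(0) for filters that capture no energy.
    log_energy = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10)
                  for row in bank]
    # DCT-II decorrelates the log filterbank energies.
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_energy))
            for c in range(n_coeffs)]

def tone(freq_hz, sample_rate=8000, n_samples=256):
    """Hypothetical test signal: a pure sine standing in for recorded speech."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n_samples)]

coeffs = mfcc_frame(tone(440.0), sample_rate=8000)
print(len(coeffs))
```

In a full pipeline, a vector like `coeffs` would be computed for every short frame of an utterance and the resulting sequence summarized or fed to a classifier; a library routine with an FFT would replace the naive DFT used here for self-containment.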

Source publication

2017 International Conference on Signal Processing and Communication (ICSPC)

Self-citation rate

0.00%

Publication count