Improved recognition rate of language identification system in noisy environment

Randheer Bagi, Jainath Yadav, K. S. Rao
2015 Eighth International Conference on Contemporary Computing (IC3), published 2015-08-20
DOI: 10.1109/IC3.2015.7346681 (https://doi.org/10.1109/IC3.2015.7346681)
Citations: 2

Abstract

Spoken language identification is the task of modeling and classifying the language spoken by an unknown speaker. The task is more challenging under real environmental conditions, where different types of noise are added to the speech signal, and the presence of noise degrades the signal in several ways. This paper covers several aspects of language identification in noisy environments. Experiments were carried out on the speaker-independent Multilingual Indian Language Speech Corpus of the Indian Institute of Technology, Kharagpur (IITKGP-MLILSC). In the proposed method, acoustic features are extracted from the raw speech signal, and Gaussian Mixture Models (GMMs) are used to train the language models. To analyze the behavior of the identification system in a noisy environment, white noise is added to the clean speech corpus at different noise levels. The recognition rate for noisy speech was about 14.84%, a significant degradation compared to clean speech. To overcome this adverse condition, noise reduction is necessary. Spectral Subtraction (SS) and Minimum Mean Square Error (MMSE) methods are used to suppress the noise. The overall average recognition rate of the proposed system on clean speech is 56.48%. For speech enhanced using SS and MMSE, the recognition rates are 35.91% and 35.53%, respectively, a significant improvement over the recognition rate for noisy speech (14.84%).
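
The noise-corruption and enhancement steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the noise levels, frame sizes, or the exact spectral subtraction variant, so the SNR value, frame length, hop size, spectral floor, the synthetic test tone, and the function names `add_white_noise`, `spectral_subtraction`, and `snr_db` are all assumptions chosen for illustration. The sketch uses basic power spectral subtraction and assumes the first few frames of the recording contain noise only.

```python
import numpy as np


def add_white_noise(clean, snr_db_target, rng):
    """Add white Gaussian noise so that the mixture has roughly the target SNR (dB)."""
    signal_power = np.mean(clean ** 2)
    noise_power = signal_power / (10.0 ** (snr_db_target / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=clean.shape)
    return clean + noise


def spectral_subtraction(noisy, noise_frames=10, frame_len=256, hop=128, floor=0.01):
    """Basic power spectral subtraction with overlap-add resynthesis.

    Assumption: the first `noise_frames` frames contain noise only and are
    used to estimate the noise power spectrum.
    """
    # Periodic Hann window: with 50% overlap the shifted windows sum to 1.
    window = np.hanning(frame_len + 1)[:-1]
    n_frames = 1 + (len(noisy) - frame_len) // hop
    frames = np.stack(
        [noisy[i * hop: i * hop + frame_len] * window for i in range(n_frames)]
    )
    spectra = np.fft.rfft(frames, axis=1)
    power = np.abs(spectra) ** 2

    # Noise power estimate, averaged over the leading noise-only frames.
    noise_power = power[:noise_frames].mean(axis=0)

    # Subtract the noise estimate and keep a small spectral floor.
    clean_power = np.maximum(power - noise_power, floor * noise_power)
    gain = np.sqrt(clean_power / np.maximum(power, 1e-12))

    # Apply the gain to the noisy spectrum (noisy phase is reused) and overlap-add.
    enhanced_frames = np.fft.irfft(gain * spectra, n=frame_len, axis=1)
    out = np.zeros(len(noisy))
    for i in range(n_frames):
        out[i * hop: i * hop + frame_len] += enhanced_frames[i]
    return out


def snr_db(reference, estimate):
    """SNR of `estimate` relative to `reference`, in dB."""
    error = reference - estimate
    return 10.0 * np.log10(np.sum(reference ** 2) / np.sum(error ** 2))


# Synthetic stand-in for speech: 0.2 s of silence followed by a 440 Hz tone.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
clean = np.where(t >= 0.2, np.sin(2.0 * np.pi * 440.0 * t), 0.0)

rng = np.random.default_rng(0)
noisy = add_white_noise(clean, snr_db_target=5.0, rng=rng)
enhanced = spectral_subtraction(noisy)

print("SNR of noisy signal (dB):   ", round(snr_db(clean, noisy), 2))
print("SNR of enhanced signal (dB):", round(snr_db(clean, enhanced), 2))
```

In a language identification pipeline of the kind the abstract describes, the enhanced waveform would then go to feature extraction and GMM scoring. Those stages are omitted here because the abstract does not specify the features or the model sizes.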