Authors: Randheer Bagi, Jainath Yadav, K. S. Rao
Venue: 2015 Eighth International Conference on Contemporary Computing (IC3)
Published: 2015-08-20
DOI: 10.1109/IC3.2015.7346681 (https://doi.org/10.1109/IC3.2015.7346681)
Improved recognition rate of language identification system in noisy environment
Spoken language identification is the task of modeling and classifying the language spoken by an unknown speaker. The task is more challenging under real environmental conditions, where different types of noise are added to the speech signal and degrade it. This paper covers several aspects of language identification in noisy environments. Experiments were carried out on the speaker-independent Multilingual Indian Language Speech Corpus of the Indian Institute of Technology, Kharagpur (IITKGP-MLILSC). In the proposed method, acoustic features are extracted from the raw speech signal, and Gaussian Mixture Models (GMMs) are used to train the language models. To analyze the behavior of the identification system in a noisy environment, white noise is added to the clean speech corpus at different noise levels. The recognition rate on noisy speech was about 14.84%, a significant degradation compared with clean speech. To overcome this adverse condition, noise reduction is necessary: the Spectral Subtraction (SS) and Minimum Mean Square Error (MMSE) methods are used to suppress the noise. The overall average recognition rate of the proposed system on clean speech is 56.48%. With speech enhanced by SS and MMSE, the recognition rate is 35.91% and 35.53%, respectively, a significant improvement over the recognition rate on noisy speech (14.84%).
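The abstract does not state which noise levels or which variant of spectral subtraction were used, so the following is only a minimal sketch of the two signal-processing steps it mentions: mixing white noise into clean speech at a chosen signal-to-noise ratio, and suppressing it with basic magnitude spectral subtraction. The function names (`add_white_noise`, `spectral_subtraction`), the parameters (`snr_db`, `frame_len`, `noise_frames`, `floor`), and the assumption that the first few frames contain noise only are illustrative choices, not details taken from the paper.

```python
import numpy as np


def add_white_noise(clean, snr_db, rng):
    """Add white Gaussian noise so the mixture has the requested SNR in dB."""
    signal_power = np.mean(clean ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=clean.shape)
    return clean + noise


def spectral_subtraction(noisy, frame_len=256, noise_frames=6, floor=0.02):
    """Basic magnitude spectral subtraction with 50% overlap-add resynthesis.

    Assumptions (not from the paper): `frame_len` is even, and the first
    `noise_frames` frames of the signal contain noise only.
    """
    hop = frame_len // 2
    window = np.hanning(frame_len)

    # Reflect-pad so that every original sample is covered by two frames.
    extra = (-len(noisy)) % hop
    padded = np.pad(noisy, (hop, hop + extra), mode="reflect")
    n_frames = 1 + (len(padded) - frame_len) // hop

    # Windowed frames and their one-sided spectra.
    frames = np.stack([padded[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    spectra = np.fft.rfft(frames, axis=1)
    magnitude, phase = np.abs(spectra), np.angle(spectra)

    # Average noise magnitude per frequency bin, from the leading frames.
    noise_mag = magnitude[:noise_frames].mean(axis=0)

    # Subtract the noise estimate; keep a small spectral floor instead of
    # letting magnitudes go negative.
    cleaned_mag = np.maximum(magnitude - noise_mag, floor * magnitude)

    # Resynthesize with the noisy phase and weighted overlap-add.
    cleaned = np.fft.irfft(cleaned_mag * np.exp(1j * phase),
                           n=frame_len, axis=1)
    out = np.zeros(len(padded))
    norm = np.zeros(len(padded))
    for i in range(n_frames):
        seg = slice(i * hop, i * hop + frame_len)
        out[seg] += cleaned[i] * window
        norm[seg] += window ** 2
    out /= np.maximum(norm, 1e-8)

    # Drop the padding so the output aligns with the input.
    return out[hop:hop + len(noisy)]


if __name__ == "__main__":
    # Synthetic stand-in for speech: short silence followed by a tone.
    rate = 8000
    t = np.arange(14000) / rate
    clean = np.concatenate([np.zeros(2000), 0.5 * np.sin(2 * np.pi * 440 * t)])
    noisy = add_white_noise(clean, snr_db=5.0, rng=np.random.default_rng(0))
    enhanced = spectral_subtraction(noisy)
    print("noisy MSE:   ", np.mean((noisy - clean) ** 2))
    print("enhanced MSE:", np.mean((enhanced - clean) ** 2))
```

In a pipeline like the one the abstract describes, the enhanced signal would then go through acoustic feature extraction and be scored against one GMM per language. Those stages are omitted here because the abstract gives no configuration for them.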