Robust Hearing-Impaired Speaker Recognition from Speech using Deep Learning Networks in Native Language

Jeyalakshmi Chelliah, KiranBala Benny, Revathi Arunachalam, Viswanathan Balasubramanian
{"title":"基于深度学习网络的母语语音鲁棒性听障说话人识别","authors":"Jeyalakshmi Chelliah, KiranBala Benny, Revathi Arunachalam, Viswanathan Balasubramanian","doi":"10.34028/iajit/20/1/11","DOIUrl":null,"url":null,"abstract":"Several research works in speaker recognition have grown recently due to its tremendous applications in security, criminal investigations and in other major fields. Identification of a speaker is represented by the way they speak, and not on the spoken words. Hence the identification of hearing-impaired speakers from their speech is a challenging task since their speech is highly distorted. In this paper, a new task has been introduced in recognizing Hearing Impaired (HI) speakers using speech as a biometric in native language Tamil. Though their speech is very hard to get recognized even by their parents and teachers, our proposed system accurately identifies them by adapting enhancement of their speeches. Due to the huge variety in their utterances, instead of applying the spectrogram of raw speech, Mel Frequency Cepstral Coefficient features are derived from speech and it is applied as spectrogram to Convolutional Neural Network (CNN), which is not necessary for ordinary speakers. In the proposed system of recognizing HI speakers, is used as a modelling technique to assess the performance of the system and this deep learning network provides 80% accuracy and the system is less complex. Auto Associative Neural Network (AANN) is used as a modelling technique and performance of AANN is only 9% accurate and it is found that CNN performs better than AANN for recognizing HI speakers. Hence this system is very much useful for the biometric system and other security related applications for hearing impaired speakers.","PeriodicalId":13624,"journal":{"name":"Int. Arab J. Inf. Technol.","volume":"22 1","pages":"102-112"},"PeriodicalIF":0.0000,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Robust Hearing-Impaired Speaker Recognition from Speech using Deep Learning Networks in Native Language\",\"authors\":\"Jeyalakshmi Chelliah, KiranBala Benny, Revathi Arunachalam, Viswanathan Balasubramanian\",\"doi\":\"10.34028/iajit/20/1/11\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Several research works in speaker recognition have grown recently due to its tremendous applications in security, criminal investigations and in other major fields. Identification of a speaker is represented by the way they speak, and not on the spoken words. Hence the identification of hearing-impaired speakers from their speech is a challenging task since their speech is highly distorted. In this paper, a new task has been introduced in recognizing Hearing Impaired (HI) speakers using speech as a biometric in native language Tamil. Though their speech is very hard to get recognized even by their parents and teachers, our proposed system accurately identifies them by adapting enhancement of their speeches. Due to the huge variety in their utterances, instead of applying the spectrogram of raw speech, Mel Frequency Cepstral Coefficient features are derived from speech and it is applied as spectrogram to Convolutional Neural Network (CNN), which is not necessary for ordinary speakers. In the proposed system of recognizing HI speakers, is used as a modelling technique to assess the performance of the system and this deep learning network provides 80% accuracy and the system is less complex. 
Auto Associative Neural Network (AANN) is used as a modelling technique and performance of AANN is only 9% accurate and it is found that CNN performs better than AANN for recognizing HI speakers. Hence this system is very much useful for the biometric system and other security related applications for hearing impaired speakers.\",\"PeriodicalId\":13624,\"journal\":{\"name\":\"Int. Arab J. Inf. Technol.\",\"volume\":\"22 1\",\"pages\":\"102-112\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Int. Arab J. Inf. Technol.\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.34028/iajit/20/1/11\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Int. Arab J. Inf. Technol.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.34028/iajit/20/1/11","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Research in speaker recognition has grown recently owing to its wide applications in security, criminal investigation, and other major fields. A speaker is identified by the way they speak, not by the words they utter. Identifying hearing-impaired speakers from their speech is therefore a challenging task, since their speech is highly distorted. This paper introduces a new task: recognizing Hearing-Impaired (HI) speakers using speech as a biometric in their native language, Tamil. Although their speech is hard to recognize even for their parents and teachers, the proposed system identifies them accurately by first enhancing their utterances. Because of the large variability in these utterances, the spectrogram of the raw speech is not used directly; instead, Mel Frequency Cepstral Coefficient (MFCC) features are extracted from the speech and fed as a spectrogram-like input to a Convolutional Neural Network (CNN), a step that is not necessary for ordinary speakers. In the proposed system, the CNN is used as the modelling technique to assess performance; this deep learning network provides 80% accuracy while keeping the system less complex. An Auto Associative Neural Network (AANN) is also evaluated as a modelling technique, but it reaches only 9% accuracy, showing that the CNN performs better than the AANN for recognizing HI speakers. Hence, this system is very useful for biometric and other security-related applications involving hearing-impaired speakers.
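
The pipeline the abstract describes (MFCC features rendered as a spectrogram-like image and classified by a CNN) can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the file path `hi_speaker_sample.wav`, the 16 kHz sampling rate, 13 MFCCs, the fixed 200-frame clip length, the network shape, and the count of 10 enrolled speakers are all assumptions made for the example.

```python
# Minimal sketch: MFCC "spectrogram" -> small CNN speaker classifier.
# Assumptions (not from the paper): 16 kHz audio, 13 MFCCs, fixed 200-frame
# clips, 10 enrolled HI speakers, and the hypothetical file path below.
import librosa
import numpy as np
import torch
import torch.nn as nn

def mfcc_image(path, sr=16000, n_mfcc=13, n_frames=200):
    """Load speech, compute MFCCs, and pad/crop to a fixed-size 'image'."""
    y, sr = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)   # (n_mfcc, T)
    if mfcc.shape[1] < n_frames:                              # pad short clips
        mfcc = np.pad(mfcc, ((0, 0), (0, n_frames - mfcc.shape[1])))
    return mfcc[:, :n_frames].astype(np.float32)              # (n_mfcc, n_frames)

class SpeakerCNN(nn.Module):
    """Small 2-D CNN over the MFCC image; one output logit per enrolled speaker."""
    def __init__(self, n_speakers=10, n_mfcc=13, n_frames=200):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * (n_mfcc // 4) * (n_frames // 4), n_speakers)

    def forward(self, x):                      # x: (batch, 1, n_mfcc, n_frames)
        h = self.features(x)
        return self.classifier(h.flatten(1))   # logits over enrolled speakers

# Usage (path and speaker count are hypothetical):
# img = torch.from_numpy(mfcc_image("hi_speaker_sample.wav")).unsqueeze(0).unsqueeze(0)
# logits = SpeakerCNN()(img)
# predicted_speaker = logits.argmax(dim=1)
```

Treating the MFCC matrix as a 2-D image lets the convolutional layers capture local time-frequency patterns, which is the motivation for feeding MFCCs rather than the raw-speech spectrogram when utterances are highly variable.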