Subspace and Frequency Domain Speech Enhancement Techniques

2021 International Conference on Recent Trends on Electronics, Information, Communication & Technology (RTEICT) Pub Date : 2021-08-27 DOI:10.1109/RTEICT52294.2021.9573833

Ragipati Naga Sai Tejaswini, Ravikumar Kandagatla, Jahnavi Nandeti, Mamidi Krupakar, Paragati Haveela


Citations: 1

Abstract

Speech enhancement, or noise reduction, is used as front-end processing for speech recognition applications. Speech enhancement is applied in mobile phones, hands-free phones, hearing aids, personal assistants, home automation, robots, and so on. Hearing aids also play an important role in comfortable listening for hearing-impaired listeners. To understand speech enhancement algorithms, it is important to analyze their output and performance while varying the parameters involved in each technique. The main objective of this paper is to compare the frequency-domain and time-domain approaches available for speech enhancement. The Karhunen-Loeve transform (KLT) and MMSE estimators for speech enhancement are discussed. Perceptually motivated techniques are observed to improve performance, so results are compared for the basic and the perceptually motivated approaches. This work discusses the theory of speech enhancement and gives guidance on implementing speech enhancement algorithms in MATLAB. The real-time application of mathematical operations such as the Fourier transform, averaging, variance, minimum mean square error estimation, and windowing is discussed. Subspace algorithms for speech enhancement are discussed, and their performance is compared with that of the frequency-domain approaches. Simulations are performed in MATLAB, and performance is compared using the objective measures signal-to-noise ratio (SNR), segmental SNR, and PESQ.
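Two of the objective measures named in the abstract, SNR and segmental SNR, can be illustrated with a minimal sketch. This is not the authors' MATLAB code. The function names, the 256-sample frame length, and the clamping range of -10 dB to 35 dB are assumptions chosen for illustration (the clamping range is a commonly used convention, not a value taken from the paper). PESQ is omitted because it requires a full perceptual model.

```python
import math


def snr_db(clean, enhanced):
    """Global SNR in dB: clean-signal energy over residual-error energy."""
    signal_energy = sum(s * s for s in clean)
    error_energy = sum((s - e) ** 2 for s, e in zip(clean, enhanced))
    if error_energy == 0.0:
        return float("inf")  # enhanced signal matches the clean one exactly
    return 10.0 * math.log10(signal_energy / error_energy)


def segmental_snr_db(clean, enhanced, frame_len=256, lo=-10.0, hi=35.0):
    """Mean of per-frame SNRs in dB, each clamped to [lo, hi].

    frame_len, lo and hi are illustrative assumptions, not values
    from the paper. Frames do not overlap; a trailing partial frame
    is ignored.
    """
    frame_snrs = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        c = clean[start:start + frame_len]
        e = enhanced[start:start + frame_len]
        sig = sum(s * s for s in c)
        err = sum((s - x) ** 2 for s, x in zip(c, e))
        if sig == 0.0:
            value = lo   # silent frame: assign the lower limit
        elif err == 0.0:
            value = hi   # perfect reconstruction: assign the upper limit
        else:
            value = 10.0 * math.log10(sig / err)
        frame_snrs.append(min(max(value, lo), hi))
    if not frame_snrs:
        raise ValueError("signal is shorter than one frame")
    return sum(frame_snrs) / len(frame_snrs)


if __name__ == "__main__":
    # Synthetic example: a sine wave and a slightly attenuated copy of it.
    clean = [math.sin(2.0 * math.pi * 0.01 * n) for n in range(1024)]
    enhanced = [0.9 * s for s in clean]
    print("SNR (dB):", snr_db(clean, enhanced))
    print("Segmental SNR (dB):", segmental_snr_db(clean, enhanced))
```

Clamping each frame before averaging keeps silent or near-perfect frames from dominating the mean, which is why segmental SNR can rank systems differently from global SNR.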

