Subspace and Frequency Domain Speech Enhancement Techniques

Ragipati Naga Sai Tejaswini, Ravikumar Kandagatla, Jahnavi Nandeti, Mamidi Krupakar, Paragati Haveela
2021 International Conference on Recent Trends on Electronics, Information, Communication & Technology (RTEICT), published 2021-08-27. DOI: 10.1109/RTEICT52294.2021.9573833 (https://doi.org/10.1109/RTEICT52294.2021.9573833)
Citations: 1

Abstract

Speech enhancement, or noise reduction, is used as front-end processing for speech recognition applications. Its applications include mobile phones, hands-free phones, hearing aids, personal assistants, home automation, and robots. Hearing aids in particular play an important role in comfortable listening for hearing-impaired listeners. To understand speech enhancement algorithms, it is important to analyze how the output and performance change as the parameters of each technique are varied. The main objective of this paper is to compare the frequency-domain and time-domain approaches available for speech enhancement. The Karhunen-Loeve transform (KLT) and MMSE estimators for speech enhancement are discussed. Perceptually motivated techniques are observed to improve performance, so results are compared for the basic and the perceptually motivated approaches. This work discusses the theory of speech enhancement and gives guidance on implementing speech enhancement algorithms in MATLAB. The real-time application of mathematical operations such as the Fourier transform, averaging, variance, minimum mean square error (MMSE) estimation, and windowing is discussed. Subspace algorithms for speech enhancement are discussed, and their performance is compared with that of the frequency-domain approaches. Simulations are performed in MATLAB, and performance is compared using the objective measures signal-to-noise ratio (SNR), segmental SNR, and PESQ.
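Two of the objective measures named in the abstract, SNR and segmental SNR, can be illustrated with a minimal sketch. It is written in plain Python rather than the MATLAB used in the paper, and it is not the authors' code. The function names, the 160-sample frame length, and the clamping of per-frame values to the range -10 dB to 35 dB are assumptions chosen for illustration (such clamping is a common convention, but the paper's settings are not given here). PESQ is omitted because it is a standardized perceptual model that does not reduce to a short formula.

```python
import math


def snr_db(clean, enhanced):
    """Global SNR in dB: clean-signal energy over residual-error energy."""
    signal_energy = sum(c * c for c in clean)
    error_energy = sum((c - e) ** 2 for c, e in zip(clean, enhanced))
    if signal_energy == 0.0:
        raise ValueError("clean signal has zero energy")
    if error_energy == 0.0:
        return math.inf  # enhanced signal matches the clean signal exactly
    return 10.0 * math.log10(signal_energy / error_energy)


def segmental_snr_db(clean, enhanced, frame_len=160, lo=-10.0, hi=35.0):
    """Mean of per-frame SNRs in dB, each clamped to [lo, hi].

    frame_len, lo and hi are illustrative assumptions, not values
    taken from the paper.
    """
    frame_snrs = []
    # Non-overlapping frames; a trailing partial frame is ignored.
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        c = clean[start:start + frame_len]
        e = enhanced[start:start + frame_len]
        sig = sum(x * x for x in c)
        err = sum((x - y) ** 2 for x, y in zip(c, e))
        if sig == 0.0:
            continue  # skip silent frames, where the ratio is undefined
        value = hi if err == 0.0 else 10.0 * math.log10(sig / err)
        frame_snrs.append(min(max(value, lo), hi))
    if not frame_snrs:
        raise ValueError("no usable frames")
    return sum(frame_snrs) / len(frame_snrs)
```

Averaging per-frame values in dB, rather than taking one global energy ratio, keeps loud frames from dominating the score, which is why segmental SNR is usually reported alongside global SNR.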