Implication of speech level control in noise to sound quality judgement

2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC). Pub Date: 2018-11-01. DOI: 10.23919/APSIPA.2018.8659672

Sara Akbarzadeh, Sungmin Lee, Satnam Singh, Chin-Tuan Tan


Citations: 3

Abstract

The relative level of speech and noise, that is, the signal-to-noise ratio (SNR), may not by itself fully account for how humans perceive speech in noise or judge the sound quality of the speech component. To date, the most common rationale in front-end processing of noisy speech in assistive hearing devices is to reduce the (estimated) “noise” with the sole objective of improving the overall SNR. The absolute sound pressure level of speech in the remaining noise, which listeners need in order to anchor their perceptual judgement, is assumed to be restored by the subsequent dynamic range compression stage, which is intended to compensate for loudness recruitment in hearing-impaired (HI) listeners. However, uncoordinated settings of the thresholds that trigger nonlinear processing in these two separate stages can instead amplify the remaining “noise” and/or distortion. This can confuse listeners' judgement of sound quality and cause it to deviate from the usual perceptual trend one would expect when more noise is present. In this study, both normal-hearing (NH) and HI listeners were asked to rate the perceived sound quality of noisy speech and noise-reduced speech. The results showed that speech processed by noise reduction algorithms was rated lower in quality than the original unprocessed speech in noise. The results also showed that sound quality judgement depended on both the input SNR and the absolute level of speech, with greater weight on the latter, for both NH and HI listeners. These findings suggest that integrating the two separate processing stages into one may better match the underlying mechanism of auditory reception of sound. Further work will attempt to identify settings of these two processing stages that improve speech reception for users of assistive hearing devices.
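The distinction the abstract draws between SNR and absolute speech level can be illustrated with a minimal sketch. This is not the authors' procedure. The signals are synthetic stand-ins, the helper names (`rms_db`, `snr_db`, `mix_at`) are hypothetical, and levels are in dB relative to a full scale of 1.0 rather than calibrated dB SPL. The sketch builds two speech-in-noise conditions that share the same SNR but differ in speech level by 20 dB, so an SNR-only metric cannot tell them apart.

```python
import numpy as np

def rms_db(x):
    """RMS level in dB re full scale 1.0 (an assumption; not calibrated dB SPL)."""
    return 20.0 * np.log10(np.sqrt(np.mean(np.square(x))))

def snr_db(speech, noise):
    """SNR as the difference between speech and noise RMS levels, in dB."""
    return rms_db(speech) - rms_db(noise)

def mix_at(speech, noise, target_snr_db, speech_level_db):
    """Scale speech to speech_level_db, then scale noise to hit target_snr_db."""
    s = speech * 10.0 ** ((speech_level_db - rms_db(speech)) / 20.0)
    noise_level_db = speech_level_db - target_snr_db
    n = noise * 10.0 ** ((noise_level_db - rms_db(noise)) / 20.0)
    return s, n

# Synthetic stand-ins: an amplitude-modulated tone for "speech", white noise for "noise".
rng = np.random.default_rng(0)
fs = 16000
t = np.arange(fs) / fs
speech = np.sin(2 * np.pi * 220 * t) * (1.0 + 0.5 * np.sin(2 * np.pi * 4 * t))
noise = rng.standard_normal(fs)

# Same SNR (5 dB), different absolute speech levels (-40 dB vs -20 dB re full scale).
quiet = mix_at(speech, noise, target_snr_db=5.0, speech_level_db=-40.0)
loud = mix_at(speech, noise, target_snr_db=5.0, speech_level_db=-20.0)

print(f"quiet: SNR={snr_db(*quiet):.1f} dB, speech level={rms_db(quiet[0]):.1f} dB")
print(f"loud:  SNR={snr_db(*loud):.1f} dB, speech level={rms_db(loud[0]):.1f} dB")
```

Both conditions report the same SNR, yet the speech component differs in level by 20 dB. The abstract reports that listeners' quality ratings were weighted more heavily toward this absolute level.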


