A Single-Input Hearing Aid Based on the Auditory Perceptual Features to Improve Speech Intelligibility in Noise

C. N. Canagarajah, P. Rayner
Published in: Final Program and Paper Summaries 1991 IEEE ASSP Workshop on Applications of Signal Processing to Audio and Acoustics, 1991-10-01
DOI: 10.1109/ASPAA.1991.634123 (https://doi.org/10.1109/ASPAA.1991.634123)
Citations: 4

Abstract

One of the main problems faced by listeners with sensorineural hearing impairment is the partial or complete loss of frequency selectivity. It is now well established that most auditory perceptual features are well represented in the ear by the spectrum of the incoming sound signal. Loss of frequency selectivity therefore makes it difficult for the impaired listener to discriminate between two sounds or to understand speech in a noisy environment. This handicap is referred to as the cocktail party effect. It is widely accepted, and shown by experiments carried out on impaired listeners, that one of the main causes of this impairment is that the auditory filters in the damaged cochlea are broad and tilted compared with those of an undamaged ear. As a result, in noisy surroundings these broad filters pass more noise than those of a normal ear, making detection of a signal in noise difficult. To improve intelligibility, a hearing aid must therefore not only suppress the noise in speech but also alleviate the problems of reduced frequency selectivity.

A few hearing aids have been proposed in the literature to enhance speech in noise. Most are based on adaptive noise cancellation or adaptive beamforming principles. They have proved very useful in situations where there are few noise sources or where a reference noise signal is available. Very often, however, the environment contains many uncorrelated noise sources, effectively creating a diffuse noise source. Obtaining a reference noise signal that is correlated with the noise in the other inputs is then impossible, and these methods produce very little speech enhancement. There are also many conventional single-input systems for suppressing noise but, like the multi-microphone methods mentioned above, they have proved to be of very little use in increasing the intelligibility of speech for the hearing impaired.
In this paper we illustrate how a single-input system incorporating auditory perceptual features could be employed to increase intelligibility in hearing aids. Spectral subtraction (SS) is an efficient way of reducing noise in single-input systems. In this method an estimate of the magnitude spectrum of the noise, N̂(ω), is obtained during non-speech activity and is subtracted from the magnitude spectrum of the noisy speech, X(ω), to obtain the enhanced speech, S(ω). This performs satisfactorily when the noise source is stationary. The main drawback of this method is that it does not consider the problems of the hearing impaired and, as a result, is of very little benefit to them. Furthermore, it introduces a residual or musical noise in the processed speech. It is shown in this paper that these problems can be eliminated by incorporating perceptual features such as masking and excitation patterns.

The technique proposed here first transforms the power (not magnitude) spectrum of the noisy speech (X(ω)) into auditory excitation patterns, E(ω). The auditory system consists of a bank of constant-bandwidth band-pass filters on a logarithmic scale (the Bark scale). Excitation patterns are obtained by convolving the signal spectrum with these filters. E(ω) then represents the power spectrum of the signal as it would appear had it been processed by a normal ear. These excitation patterns enable the hearing impaired to group frequency components quite successfully, thereby increasing their frequency selectivity in spite of the broad filters in their auditory system. This transformation also removes unwanted noise and speech that might otherwise have been processed by the impaired ear.
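The two operations described in the abstract, magnitude spectral subtraction and the transformation of a power spectrum into an excitation pattern, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the triangular filter shape, the one-Bark filter width, the per-filter normalisation, and the Zwicker-Terhardt approximation of the Bark scale are all assumptions made here for illustration, since the abstract does not specify them.

```python
import numpy as np


def hz_to_bark(f_hz):
    """Approximate Bark value for a frequency in Hz.

    Assumption: the Zwicker-Terhardt approximation; the abstract does not
    say which Bark mapping the paper uses.
    """
    f = np.asarray(f_hz, dtype=float)
    return 13.0 * np.arctan(0.00076 * f) + 3.5 * np.arctan((f / 7500.0) ** 2)


def spectral_subtract(noisy_mag, noise_mag, floor=0.0):
    """Subtract a noise magnitude estimate from a noisy magnitude spectrum.

    Negative differences are clipped to `floor`. The isolated peaks that
    survive this clipping are what is heard as musical noise.
    """
    diff = np.asarray(noisy_mag, dtype=float) - np.asarray(noise_mag, dtype=float)
    return np.maximum(diff, floor)


def excitation_pattern(power_spec, freqs_hz, bandwidth_bark=1.0):
    """Smooth a power spectrum with filters of equal width on the Bark axis.

    Assumption: one triangular, unit-sum filter centred on every frequency
    bin. The real auditory filter shapes are not given in the abstract.
    """
    z = hz_to_bark(freqs_hz)
    # Bark distance between every filter centre (rows) and every bin (columns).
    dist = np.abs(z[:, None] - z[None, :])
    weights = np.maximum(1.0 - dist / bandwidth_bark, 0.0)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ np.asarray(power_spec, dtype=float)


# Illustrative use on one synthetic frame: a 1 kHz tone in white noise.
fs = 8000                      # hypothetical sampling rate in Hz
n_fft = 256                    # hypothetical frame length
freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
rng = np.random.default_rng(0)
t = np.arange(n_fft) / fs
tone = np.sin(2 * np.pi * 1000 * t)

noisy_mag = np.abs(np.fft.rfft(tone + 0.3 * rng.standard_normal(n_fft)))
# Noise estimate taken from a separate noise-only ("non-speech") frame.
noise_mag = np.abs(np.fft.rfft(0.3 * rng.standard_normal(n_fft)))

enhanced_mag = spectral_subtract(noisy_mag, noise_mag)
excitation = excitation_pattern(noisy_mag ** 2, freqs)
```

Each filter is normalised to unit sum so that a flat power spectrum maps to a flat excitation pattern. That is a convenience of this sketch and not something stated in the source.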