Johannes W. de Vries;Steven van de Par;Geert Leus;Richard Heusdens;Richard C. Hendriks
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 32, pp. 4002-4012. Published 2024-08-29. Journal article. DOI: 10.1109/TASLP.2024.3451988. URL: https://ieeexplore.ieee.org/document/10659165/. Impact factor: 4.1; JCR: Q1 (Acoustics); category: Computer Science.
Binaural Beamforming Taking Into Account Spatial Release From Masking
Hearing impairment is a prevalent problem with daily challenges like impaired speech intelligibility and sound localisation. One of the shortcomings of spatial filtering in hearing aids is that speech intelligibility is often not optimised directly, meaning that different auditory processes contributing to intelligibility are often not considered. One example is the perceptual phenomenon known as spatial release from masking (SRM). This paper develops a signal model that explicitly considers SRM in the beamforming design, achieved by transforming the binaural intelligibility prediction model (BSIM) into a signal processing framework. The resulting extended signal model is used to analyse the performance of reference beamformers and design a novel beamformer that more closely considers how the auditory system perceives binaural sound. It can be shown that the binaural minimum variance distortionless response (BMVDR) beamformer is also an optimal solution for the extended, perceived model, suggesting that SRM does not play a significant role in intelligibility enhancement after optimal beamforming. However, the optimal beamformer is no longer unique in the extended signal model. The additional secondary degrees of freedom can be used to preserve binaural cues of interfering sources while still achieving the same perceived performance of the BMVDR beamformer, though with a possible high sensitivity to intelligibility model mismatch errors.
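For readers unfamiliar with the reference beamformer named in the abstract, the following is a minimal sketch of the textbook BMVDR filter for a single frequency bin. It is not taken from the paper: the function name `bmvdr_weights`, the variable names, and the random test data are illustrative assumptions, and the extended (BSIM-based) signal model that the paper develops is not reproduced here. The sketch assumes a known acoustic transfer function vector `a` for the target and a Hermitian positive-definite noise covariance `R_noise`.

```python
import numpy as np

def bmvdr_weights(R_noise, a, ref_left, ref_right):
    """Textbook BMVDR filters for one frequency bin (illustrative sketch).

    R_noise   : (M, M) Hermitian positive-definite noise covariance (assumed known)
    a         : (M,) acoustic transfer function vector of the target (assumed known)
    ref_left  : index of the left reference microphone
    ref_right : index of the right reference microphone
    """
    Rinv_a = np.linalg.solve(R_noise, a)       # R^{-1} a without forming the inverse
    denom = np.real(np.vdot(a, Rinv_a))        # a^H R^{-1} a, real for Hermitian R
    w_left = Rinv_a * np.conj(a[ref_left]) / denom
    w_right = Rinv_a * np.conj(a[ref_right]) / denom
    return w_left, w_right

# Hypothetical data: 4 microphones, random target vector and noise covariance.
rng = np.random.default_rng(0)
M = 4
a = rng.standard_normal(M) + 1j * rng.standard_normal(M)
B = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
R_noise = B @ B.conj().T + M * np.eye(M)       # Hermitian, positive definite

w_left, w_right = bmvdr_weights(R_noise, a, ref_left=0, ref_right=M - 1)

# Distortionless constraint: the target at each reference microphone is unchanged.
out_left = np.vdot(w_left, a)                  # w_left^H a
out_right = np.vdot(w_right, a)                # w_right^H a

# Any other source x leaves the two filters with the same left/right ratio as the target.
x = rng.standard_normal(M) + 1j * rng.standard_normal(M)
ratio_interferer = np.vdot(w_left, x) / np.vdot(w_right, x)
ratio_target = a[0] / a[M - 1]
```

In this standard formulation the left and right filters are scaled copies of the same vector, so every source at the output carries the target's interaural transfer function. This is the loss of interferer binaural cues that the abstract says the additional degrees of freedom in the extended model can be used to avoid.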
About the journal:
The IEEE/ACM Transactions on Audio, Speech, and Language Processing covers audio, speech and language processing and the sciences that support them. In audio processing: transducers, room acoustics, active sound control, human audition, analysis/synthesis/coding of music, and consumer audio. In speech processing: areas such as speech analysis, synthesis, coding, speech and speaker recognition, speech production and perception, and speech enhancement. In language processing: speech and text analysis, understanding, generation, dialog management, translation, summarization, question answering and document indexing and retrieval, as well as general language modeling.