Binaural Beamforming Taking Into Account Spatial Release From Masking

Impact factor: 5.1; Zone 2 (Computer Science); JCR Q1 (Acoustics)

IEEE/ACM Transactions on Audio, Speech, and Language Processing. Pub Date: 2024-08-29. DOI: 10.1109/TASLP.2024.3451988

Johannes W. de Vries;Steven van de Par;Geert Leus;Richard Heusdens;Richard C. Hendriks

Volume 32, pages 4002-4012. Publication type: Journal Article. Open access: no. Full text: https://ieeexplore.ieee.org/document/10659165/

Citations: 0

Abstract

Hearing impairment is a prevalent problem with daily challenges like impaired speech intelligibility and sound localisation. One of the shortcomings of spatial filtering in hearing aids is that speech intelligibility is often not optimised directly, meaning that different auditory processes contributing to intelligibility are often not considered. One example is the perceptual phenomenon known as spatial release from masking (SRM). This paper develops a signal model that explicitly considers SRM in the beamforming design, achieved by transforming the binaural intelligibility prediction model (BSIM) into a signal processing framework. The resulting extended signal model is used to analyse the performance of reference beamformers and design a novel beamformer that more closely considers how the auditory system perceives binaural sound. It can be shown that the binaural minimum variance distortionless response (BMVDR) beamformer is also an optimal solution for the extended, perceived model, suggesting that SRM does not play a significant role in intelligibility enhancement after optimal beamforming. However, the optimal beamformer is no longer unique in the extended signal model. The additional secondary degrees of freedom can be used to preserve binaural cues of interfering sources while still achieving the same perceived performance of the BMVDR beamformer, though with a possible high sensitivity to intelligibility model mismatch errors.
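For readers unfamiliar with the reference beamformer named in the abstract, the following is a minimal sketch of the standard textbook BMVDR filter pair. It does not implement the paper's extended, SRM-aware signal model or its novel beamformer. The function name `bmvdr_weights`, the acoustic transfer function vector `a`, the noise covariance `R_n`, and the reference microphone indices are all illustrative assumptions, and the paper's own notation may differ.

```python
import numpy as np

def bmvdr_weights(a, R_n, ref_left, ref_right):
    """Standard BMVDR filters for one frequency bin (illustrative sketch).

    a         : (M,) complex acoustic transfer functions from the target
                to all M microphones (both hearing aids stacked)
    R_n       : (M, M) Hermitian positive-definite noise covariance
    ref_left  : index of the left reference microphone
    ref_right : index of the right reference microphone
    """
    a = np.asarray(a, dtype=complex)
    Rinv_a = np.linalg.solve(R_n, a)        # R_n^{-1} a without explicit inverse
    denom = np.real(np.vdot(a, Rinv_a))     # a^H R_n^{-1} a (real for Hermitian R_n)
    # Each filter minimises output noise power subject to w^H a = a[ref].
    w_left = Rinv_a * np.conj(a[ref_left]) / denom
    w_right = Rinv_a * np.conj(a[ref_right]) / denom
    return w_left, w_right

# Synthetic example with made-up data (assumption: M = 4 microphones).
rng = np.random.default_rng(0)
M = 4
a = rng.standard_normal(M) + 1j * rng.standard_normal(M)
B = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
R_n = B @ B.conj().T + M * np.eye(M)        # Hermitian positive definite

w_left, w_right = bmvdr_weights(a, R_n, ref_left=0, ref_right=M - 1)

# np.vdot conjugates its first argument, so this computes w^H a.
resp_left = np.vdot(w_left, a)
resp_right = np.vdot(w_right, a)
```

Because both filters are scaled versions of the same vector, the target reaches each ear with the transfer function of that ear's reference microphone, which keeps the target's binaural cues intact. The same scaling also collapses interfering sources onto the target direction, which is the cue distortion that the abstract's additional degrees of freedom are meant to address.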

Source journal

IEEE/ACM Transactions on Audio, Speech, and Language Processing (categories: ACOUSTICS; ENGINEERING, ELECTRICAL & ELECTRONIC)

CiteScore: 11.30

Self-citation rate: 11.10%

Articles published: 217

Journal description: The IEEE/ACM Transactions on Audio, Speech, and Language Processing covers audio, speech and language processing and the sciences that support them. In audio processing: transducers, room acoustics, active sound control, human audition, analysis/synthesis/coding of music, and consumer audio. In speech processing: areas such as speech analysis, synthesis, coding, speech and speaker recognition, speech production and perception, and speech enhancement. In language processing: speech and text analysis, understanding, generation, dialog management, translation, summarization, question answering, and document indexing and retrieval, as well as general language modeling.