Judging the Number and Gender of Talkers Present in an Auditory Scene Aided by Acoustic Beamforming.

IF 3; Zone 2 (Medicine); JCR Q1: AUDIOLOGY & SPEECH-LANGUAGE PATHOLOGY

Trends in Hearing Pub Date : 2025-01-01 Epub Date: 2025-05-29 DOI:10.1177/23312165251329791

Andrew J Byrne, Gerald Kidd

Trends in Hearing, vol. 29, article 23312165251329791. Publication type: Journal Article. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12123112/pdf/

Citations: 0

Abstract

The perceived numerosity of simultaneous, spatially separated speech sources was used to evaluate the effectiveness of triple beamformer processing, compared to that of both a single-channel beamformer and natural listening. Participants made judgments of the total number of talkers present in a simulated sound field and the gender composition of the talker group. The perceived numerosity was always underestimated for groups of more than three talkers. Performance with the triple beamformer was roughly equivalent to that of natural listening, including a beneficial effect of spatial separation of the sources in azimuth. The gender mix of the talker group also affected the numerosity judgments, although the perceived gender ratio was generally accurate even when the total group count was underestimated. Time-reversing the speech resulted in lower numerosity judgments (increased error) under both natural and triple beamformer listening, suggesting an influence of linguistic processing on source numerosity judgments. Overall, factors that enhanced source segregation and speech stream coherence decreased errors in numerosity judgments. A stimulus-derived metric, the composite of glimpsed energy retained for all talkers in the sound field, was found to be a reasonably accurate predictor of the subjective numerosity judgments.


Source journal

Trends in Hearing (AUDIOLOGY & SPEECH-LANGUAGE PATHOLOGY; OTORHINOLARYNGOLOGY)

CiteScore: 4.50

Self-citation rate: 11.10%

Articles published: not listed

Review time: 12 weeks

Journal description: Trends in Hearing is an open access journal completely dedicated to publishing original research and reviews focusing on human hearing, hearing loss, hearing aids, auditory implants, and aural rehabilitation. Under its former name, Trends in Amplification, the journal established itself as a forum for concise explorations of all areas of translational hearing research by leaders in the field. Trends in Hearing has now expanded its focus to include original research articles, with the goal of becoming the premier venue for research related to human hearing and hearing loss.