{"title":"The effects of ethnic bias and face on identification, accentedness judgements and intelligibility of Cantonese accents in Hong Kong.","authors":"Grace Wenling Cao, Him Cheung, Peggy Mok","doi":"10.1121/10.0035783","DOIUrl":"https://doi.org/10.1121/10.0035783","url":null,"abstract":"<p><p>Social information such as a talker's ethnicity, gender, and age is found to affect accent perception and attitudes. While existing research primarily focuses on English-dominant communities, this study aims to fill the gap by examining the impacts of ethnic bias and face on three Cantonese accents in Hong Kong. Nine groups of 20 Hong Kong Cantonese listeners were exposed to three Cantonese accents (i.e., Hong Kong local Cantonese, Mandarin-accented, and English-accented Cantonese) in three conditions of visual cues (i.e., a silhouette, a South Asian face, and a White face). For accent identification, seeing a South Asian face in a mismatch condition led to more errors compared to seeing a White face in the same condition. For intelligibility, an enhancement of intelligibility was found when the face and accent were misaligned (e.g., an English accent matched with a South Asian face), supporting the general adaptation mechanism instead of the expectation mechanism. We argue that listeners might perceive South Asian and White faces as the same broad social category \"foreigners/outgroup members,\" resulting in a similar enhancement effect in the aligned and misaligned conditions. A dual-activation mechanism is proposed to account for the complementary effect of phonological and visual cues on accent perception.</p>","PeriodicalId":17168,"journal":{"name":"Journal of the Acoustical Society of America","volume":"157 3","pages":"1618-1631"},"PeriodicalIF":2.1,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143567476","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Physics and Astronomy","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Psychoacoustic evaluation of different fan designs for an urban air mobility vehicle with distributed propulsion system.","authors":"Stephen Schade, Roberto Merino-Martinez, Antoine Moreau, Susanne Bartels, Robert Jaron","doi":"10.1121/10.0036228","DOIUrl":"https://doi.org/10.1121/10.0036228","url":null,"abstract":"<p><p>Distributed propulsion systems are developed to power a new generation of aircraft. However, it is not yet known which noise emissions these propulsion systems produce, which psychoacoustic characteristics such systems exhibit, and how the generated noise is perceived. This paper investigates how fans with fewer stator than rotor blades affect the noise perception of a distributed propulsion system intended for an urban air mobility vehicle, which is equipped with 26 low-speed ducted fans. Three fan designs with different tonal-to-broadband noise ratios and opposite dominant noise radiation directions are examined. An analytical process is applied to determine the noise emission, propagate the sound through the atmosphere, auralize the flyover signals, and calculate psychoacoustic metrics. A validation and comparison with A320 turbofan engines at takeoff is provided. The results indicate that the distributed propulsion system generates noise signatures with complex directional characteristics and high sharpness. By applying tonal noise reduction mechanisms at the source, a significant effective perceived noise level reduction is achieved for the considered fan stages with fewer stator than rotor blades. In addition, tonality, loudness, and roughness are reduced by well above one just-noticeable difference compared to a baseline fan, and similar or even lower values are achieved than with turbofans.</p>","PeriodicalId":17168,"journal":{"name":"Journal of the Acoustical Society of America","volume":"157 3","pages":"2150-2167"},"PeriodicalIF":2.1,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143709916","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Physics and Astronomy","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Auditory localization of multiple stationary electric vehicles.","authors":"Leon Müller, Jens Forssén, Wolfgang Kropp","doi":"10.1121/10.0036248","DOIUrl":"https://doi.org/10.1121/10.0036248","url":null,"abstract":"<p><p>Current regulations require electric vehicles to be equipped with acoustic vehicle alerting systems (AVAS), radiating artificial warning sounds at low driving speeds. The requirements for these sounds are based on human subject studies, primarily estimating detection time for single vehicles. This paper presents a listening experiment assessing the accuracy and time of localization using a concealed array of 24 loudspeakers. Static single- and multiple-vehicle scenarios were compared using combustion engine noise, a two-tone AVAS, a multi-tone AVAS, and a narrowband noise AVAS. The results of 52 participants show a significant effect of the sound type on localization accuracy and time for all evaluated scenarios (p<0.001). Post-hoc tests revealed that the two-tone AVAS is localized significantly worse than the other signals, especially when simultaneously presenting two or three vehicles with the same type of sound. The multi-tone and noise AVAS are generally on par but localized worse than combustion noise for multi-vehicle scenarios. For multiple vehicles, the percentage of failed localizations drastically increased for all three AVAS signals, with the two-tone AVAS performing worst. These results indicate that signals typically performing well in a single-vehicle detection task are not necessarily easy to localize, especially not in multi-vehicle scenarios.</p>","PeriodicalId":17168,"journal":{"name":"Journal of the Acoustical Society of America","volume":"157 3","pages":"2029-2041"},"PeriodicalIF":2.1,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143692448","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Physics and Astronomy","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Formant-based vowel categorization for cross-lingual phone recognition.","authors":"Marija Stepanović, Christian Hardmeier, Odette Scharenborg","doi":"10.1121/10.0036222","DOIUrl":"https://doi.org/10.1121/10.0036222","url":null,"abstract":"<p><p>Multilingual phone recognition models can learn language-independent pronunciation patterns from large volumes of spoken data and recognize them across languages. This potential can be harnessed to improve speech technologies for underresourced languages. However, these models are typically trained on phonological representations of speech sounds, which do not necessarily reflect the phonetic realization of speech. A mismatch between a phonological symbol and its phonetic realizations can lead to phone confusions and reduce performance. This work introduces formant-based vowel categorization aimed at improving cross-lingual vowel recognition by uncovering a vowel's phonetic quality from its formant frequencies, and reorganizing the vowel categories in a multilingual speech corpus to increase their consistency across languages. The work investigates vowel categories obtained from a trilingual multi-dialect speech corpus of Danish, Norwegian, and Swedish using three categorization techniques. Cross-lingual phone recognition experiments reveal that uniting vowel categories of different languages into a set of shared formant-based categories improves cross-lingual recognition of the shared vowels, but also interferes with recognition of vowels not present in one or more training languages. Cross-lingual evaluation on regional dialects provides inconclusive results. Nevertheless, improved recognition of individual vowels can translate to improvements in overall phone recognition on languages unseen during training.</p>","PeriodicalId":17168,"journal":{"name":"Journal of the Acoustical Society of America","volume":"157 3","pages":"2248-2262"},"PeriodicalIF":2.1,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143719929","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Physics and Astronomy","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptation rate and persistence across multiple sets of spectral cues for sound localization.","authors":"Paul Friedrich, Marc Schönwiesner","doi":"10.1121/10.0036056","DOIUrl":"https://doi.org/10.1121/10.0036056","url":null,"abstract":"<p><p>The adult auditory system adapts to changes in spectral cues for sound localization. This plasticity was demonstrated by modifying the shape of the pinnae with molds. Previous studies investigating this adaptation process have focused on the effects of learning one additional set of spectral cues. However, adaptation to multiple pinna shapes could reveal limitations in the auditory system's ability to encode discrete spectral-to-spatial mappings without interference and thus help determine the mechanism underlying spectral cue relearning. In the present study, listeners learned to localize sounds with two different sets of earmolds within consecutive adaptation periods. To establish both representations in quick succession, participants underwent daily sessions of sensory-motor training. Both pinna modifications severely disrupted vertical sound localization, but participants recovered within each 5-day adaptation period. After the second adaptation, listeners were able to access three different sets of spectral cues for sound localization. Participants adapted to both sets of earmolds with equal success, and learning a second set of modified cues did not interfere with the previous adaptation. We found no indication of meta-adaptation, as the rate of adaptation to the second molds was not increased.</p>","PeriodicalId":17168,"journal":{"name":"Journal of the Acoustical Society of America","volume":"157 3","pages":"1543-1553"},"PeriodicalIF":2.1,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143542424","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Physics and Astronomy","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mouth rhythm as a \"packaging mechanism\" of information in speech: A proof of concept.","authors":"Lei He","doi":"10.1121/10.0035944","DOIUrl":"https://doi.org/10.1121/10.0035944","url":null,"abstract":"<p><p>This paper postulated and tested the possibility that the mouth rhythm functions as a \"packaging mechanism\" of information in speech. Cross-spectral analysis between two time series of mouth aperture size [parameterized as sample-by-sample interlip distances, i.e., o(t)] and information variations in speech [parameterized as frame-by-frame spectral entropy values, i.e., h(t)] was employed to reveal their underlying spectro-temporal relationship. Using a corpus containing more than 1000 utterances produced by a typical British English speaker, it was observed that both signals share slow recurring rates corresponding to the stress and syllable, with a slight phase lag of h(t) behind o(t) in the vicinity of 5 Hz.</p>","PeriodicalId":17168,"journal":{"name":"Journal of the Acoustical Society of America","volume":"157 3","pages":"1612-1617"},"PeriodicalIF":2.1,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143542428","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Physics and Astronomy","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Perceptual adaptation to dysarthric speech is modulated by concurrent phonological processing: A dual task study.","authors":"Patti Adank, Han Wang, Taylor Hepworth, Stephanie A Borrie","doi":"10.1121/10.0035883","DOIUrl":"10.1121/10.0035883","url":null,"abstract":"<p><p>Listeners can adapt to noise-vocoded speech under divided attention using a dual task design [Wang, Chen, Yan, McGettigan, Rosen, and Adank, Trends Hear. 27, 23312165231192297 (2023)]. Adaptation to noise-vocoded speech, an artificial degradation, was largely unaffected for domain-general (visuomotor) and domain-specific (semantic or phonological) dual tasks. The study by Wang et al. was replicated in an online between-subject experiment with 4 conditions (N = 192) using 40 dysarthric sentences, a natural, real-world variation of the speech signal to which listeners can adapt, to provide a closer test of the role of attention in adaptation. Participants completed a speech-only task (control) or a dual task, aiming to recruit domain-specific (phonological or lexical) or domain-general (visual) attentional processes. The results showed initial suppression of adaptation in the phonological condition during the first ten trials, in addition to poorer overall speech comprehension compared to the speech-only, lexical, and visuomotor conditions. Yet, as there was no difference in the rate of adaptation across the 40 trials for the 4 conditions, it was concluded that perceptual adaptation to dysarthric speech could occur under divided attention, and it seems likely that adaptation is an automatic cognitive process that can occur under load.</p>","PeriodicalId":17168,"journal":{"name":"Journal of the Acoustical Society of America","volume":"157 3","pages":"1598-1611"},"PeriodicalIF":2.1,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11905114/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143586048","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Physics and Astronomy","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Angular spatial compounding of diffraction corrected images improves ultrasound attenuation measurements.","authors":"Mingrui Liu, James W Wiskin, Gregory J Czarnota, Michael L Oelze","doi":"10.1121/10.0036124","DOIUrl":"10.1121/10.0036124","url":null,"abstract":"<p><p>Breast cancer is a leading cause of death for women. Quantitative ultrasound (QUS) and ultrasound computed tomography (USCT) are quantitative imaging techniques that have been investigated for management of breast cancer. QUS and USCT can generate ultrasound attenuation images. In QUS, the spectral log difference (SLD) is a technique that can provide estimates of the attenuation coefficient slope. Full angular spatial compounding (FASC) can be used with SLD to generate attenuation maps with better spatial resolution and lower estimate variance. In USCT, high quality speed of sound (SOS) images can be generated using the full wave inversion (FWI) method, but attenuation images created using FWI are often of inferior quality. With the QTI Breast Acoustic CT™ Scanner (QT Imaging, Inc., Novato, CA), raw in-phase and quadrature data were used to implement SLD combined with FASC. The capabilities of SLD were compared with FWI through simulations, phantom experiments, and in vivo breast experiments. Results show that SLD resulted in improved accuracy in estimating lesion sizes compared to FWI. Further, SLD images had lower variance and mean absolute error (MAE) compared to FWI of the same samples with respect to the attenuation values (reducing MAE by three times) in the tissue mimicking phantoms.</p>","PeriodicalId":17168,"journal":{"name":"Journal of the Acoustical Society of America","volume":"157 3","pages":"1638-1649"},"PeriodicalIF":2.1,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11890159/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143575703","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Physics and Astronomy","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comparison of virtual reality and web-based listening experiments on the perception in complex auralized environments.","authors":"Rouben Rehman, Christian Dreier, Jonas Heck, Josep Llorca-Bofí, Michael Vorländer","doi":"10.1121/10.0036147","DOIUrl":"https://doi.org/10.1121/10.0036147","url":null,"abstract":"<p><p>Listening experiments are crucial for understanding human sound perception. In overall human perception, combined audiovisual effects play an important role. However, traditional virtual reality (VR) setups, consisting of a head-mounted display (HMD) and headphones, are limited by their need for expensive equipment and time-consuming laboratory sessions. Striving for alternatives, online experiments have demonstrated their potential in other areas of research. However, these experiments have been restricted to basic setups lacking interactivity. This study presents a web-based approach in which audiovisual experiments are run on a server and streamed in real time. To this end, two reproduction setups are compared: an immersive laboratory setup (HMD-based visualization with controller navigation and headphones) and a consumer setup (screen-based visualization with keyboard navigation and headphones). The experiment comprises quality ratings and noise assessments of four auralized noise conditions with additional visualization. For noise perception experiments, the results are promising, showing minimal differences in questionnaire ratings between VR and streaming reproduction. Visual quality ratings suffered mildly in the consumer setup, but auralization quality was rated similarly positive in both cases. Even with a lower feeling of presence in the consumer setup, the subjects' attention remained similarly high. Finally, accessibility and quality ratings indicate promising results, too.</p>","PeriodicalId":17168,"journal":{"name":"Journal of the Acoustical Society of America","volume":"157 3","pages":"2001-2017"},"PeriodicalIF":2.1,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143692505","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Physics and Astronomy","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quantifying differences in dolphin hearing thresholds obtained with behavioral and auditory evoked potential methods.","authors":"Dorian S Houser, Kyle Donohoe, Jason Mulsow, James J Finneran","doi":"10.1121/10.0036153","DOIUrl":"https://doi.org/10.1121/10.0036153","url":null,"abstract":"<p><p>Different methods of producing the auditory steady state response (ASSR) are used to test dolphin hearing, but each method affects the resulting ASSR threshold. Since behavioral thresholds are often desired, this study, using common ASSR methods, compared differences between ASSR and behavioral hearing thresholds in five dolphins. Sinusoidal amplitude modulated (SAM) tones or tone pip trains were presented to the dolphins through a contact transducer while they were in air or partially submerged under water. Underwater behavioral hearing thresholds were obtained with pure tone stimuli on the same days as ASSR testing. Independent of the test medium, SAM tone stimuli yielded thresholds that consistently overestimated (i.e., were higher than) behavioral thresholds. Tone pip trains consistently underestimated thresholds when presented in air, and while they underestimated thresholds at lower test frequencies, they overestimated thresholds at higher test frequencies when presented under water. The mean differences between ASSR and behavioral thresholds were almost always lower when using tone pip train stimuli, but were exaggerated up to -47 dB when testing frequencies just above the upper-frequency limit of hearing. Knowing the relationship between ASSR and behavioral thresholds enables better approximations of behavioral thresholds in dolphins for which only ASSR thresholds exist.</p>","PeriodicalId":17168,"journal":{"name":"Journal of the Acoustical Society of America","volume":"157 3","pages":"1955-1968"},"PeriodicalIF":2.1,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143674221","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Physics and Astronomy","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}