{"title":"Effects of Torso Location and Rotation to HRTF","authors":"Jaan Johansson, A. Mäkivirta, Matti Malinen","doi":"10.17743/jaes.2022.0152","DOIUrl":"https://doi.org/10.17743/jaes.2022.0152","url":null,"abstract":"The significance of representing realistic torso orientation relative to the head in the head-related transfer function (HRTF) is studied in this work. Actual head position relative to the torso is found for 195 persons. The effect of the head position in HRTF is studied by modifying the 3D model of a Kemar head-and-torso simulator geometry by translating the head relative to torso in up-down and forward-backward directions and rotating the torso. The spectral difference is compared to that seen in the closest matching actual persons. Forward-backward location of the head has the strongest influence in the HRTF. The spectral difference between the fixed and rotated torso spectra can exceed a 1-dB limit for all sound arrival azimuth directions when the torso rotation exceeds 10°. The spectral difference decreases with increasing source elevation. A subjective listening test with personal HRTF demonstrates that the spectral effect of the torso rotation are audible as a sound color and location changes. The HRTF data in this work is found by calculating the sound field using the boundary element method and the 3D shape of the person acquired using photogrammetry.","PeriodicalId":50008,"journal":{"name":"Journal of the Audio Engineering Society","volume":null,"pages":null},"PeriodicalIF":1.1,"publicationDate":"2024-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141826125","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Perceptual Comparison of Dynamic Binaural Reproduction Methods for Sparse Head-Mounted Microphone Arrays","authors":"Benjamin Stahl, Stefan Riedel","doi":"10.17743/jaes.2022.0140","DOIUrl":"https://doi.org/10.17743/jaes.2022.0140","url":null,"abstract":"This paper presents results of a listening experiment evaluating three-degrees-of-freedom binaural reproduction of head-mounted microphone array signals. The methods are applied to an array of five microphones whose signals were simulated for static and dynamic array orientations. Methods under test involve scene-agnostic binaural reproduction methods as well as methods that have knowledge of (a subset of) source directions. The results of an instrumental evaluation reveal errors in the reproduction of interaural level and time differences for all scene-agnostic methods, which are smallest for the end-to-end magnitude-least-squares method. Additionally, the inherent localization robustness of the array under test and different simulated microphone arrays is investigated and discussed, which is of interest for a parametric reproduction method included in the study. In the listening experiment, the end-to-end magnitude-least-squares reproduction method outperforms other scene-agnostic approaches. Above all, linearly constrained beamformers using known source directions in combination with the end-to-end magnitude-least-squares method outcompete the scene-agnostic methods in perceived quality, especially for a rotating microphone array under anechoic conditions.","PeriodicalId":50008,"journal":{"name":"Journal of the Audio Engineering Society","volume":null,"pages":null},"PeriodicalIF":1.1,"publicationDate":"2024-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141824152","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Singer and Audience Evaluations of a Networked Immersive Audio Concert","authors":"Patrick Cairns, Tomasz Rudzki, J. Cooper, Anthony Hunt, Kim Steele, Gerardo Acosta Martínez, Andrew Chadwick, Helena Daffern, Gavin Kearney","doi":"10.17743/jaes.2022.0145","DOIUrl":"https://doi.org/10.17743/jaes.2022.0145","url":null,"abstract":"At the 2023 AES International Conference on Spatial and Immersive Audio, a networked immersive audio concert was performed. A vocal octet connected over the Internet between York and Huddersfield and provided a performance that was auralized in the acoustics of BBC Maida Vale Studio 2. A live audience in Huddersfield experienced the concert with local singers on stage, remote singers auralized alongside, and virtual acoustics rendered on a multichannel array. Another audience in York listened to the concert on headphones. An evaluation of the networked concert experience of the performers and audience is presented in this paper. Results demonstrate that a generally high-quality experience was delivered. Audience response to immersive audio rating items demonstrates a variance in experience. Several aspects of the evaluation context are identified as relevant to this rating variance and discussed as open challenges for audio engineers.","PeriodicalId":50008,"journal":{"name":"Journal of the Audio Engineering Society","volume":null,"pages":null},"PeriodicalIF":1.1,"publicationDate":"2024-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141827043","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Practical Implementation of Automated Next Generation Audio Production for Live Sports","authors":"Aimée Moulson, Max Walley, Yannik Grewe, Rob Oldfield, Ben Shirley, Ulli Scuda","doi":"10.17743/jaes.2022.0151","DOIUrl":"https://doi.org/10.17743/jaes.2022.0151","url":null,"abstract":"Producing a high-quality audio mix for a live sports production is a demanding task for mixing engineers. The management of many microphone signals and monitoring of various broadcast feeds mean engineers are often stretched, overseeing many tasks simultaneously. With the advancements in Next Generation Audio codecs providing many appealing features, such as interactivity and personalization to end users, consideration is needed as not to create further work for production staff. Therefore, the authors propose a novel approach to live sports production by combining an object-based audio workflow with the efficiency benefits of automated mixing. This paper describes how a fully object-based workflow can be built from point of capture to audience playback with minimal changes for the production staff. This was achieved by integrating Next Generation Audio authoring from the point of production, streamlining the workflow, and thus removing the need for additional authoring process later in the chain. As an exemplar, the authors applied this approach to a Premier League football match in a proof-of-concept trial.","PeriodicalId":50008,"journal":{"name":"Journal of the Audio Engineering Society","volume":null,"pages":null},"PeriodicalIF":1.1,"publicationDate":"2024-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141826952","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Spatial Sampling of Binaural Room Transfer Functions for Head-Tracked Personal Sound Zones","authors":"Yue Qiao, Jessica Luo, Edgar Choueiri","doi":"10.17743/jaes.2022.0144","DOIUrl":"https://doi.org/10.17743/jaes.2022.0144","url":null,"abstract":"The spatial sampling of binaural room transfer functions that vary with listener movements, as required for rendering personal sound zone (PSZ) with head tracking, was experimentally investigated regarding its dependencies on various factors. Through measurements of the binaural room transfer functions in a practical PSZ system with either translational or rotational movements of one of the two mannequin listeners, the PSZ filters were generated along the measurement grid and then spatially downsampled to different resolutions, at which the isolation performance of the system was numerically simulated. It was found that the spatial sampling resolution generally depends on factors such as the moving listener’s position, frequency band of the rendered audio, and perturbation caused by the other listener. More specifically, the required sampling resolution is inversely proportional to the distance either between two listeners or between the moving listener and the loudspeakers and is proportional to the frequency of the rendered audio. The perturbation caused by the other listener may impair both the isolation performance and filter robustness against movements. Furthermore, two crossover frequencies were found to exist in the system, which divide the frequency band into three sub-bands, each with a distinctive requirement for spatial sampling.","PeriodicalId":50008,"journal":{"name":"Journal of the Audio Engineering Society","volume":null,"pages":null},"PeriodicalIF":1.1,"publicationDate":"2024-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141827122","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Practitioners' Perspectives on Spatial Audio: Insights into Dolby Atmos and Binaural Mixes in Popular Music","authors":"Christopher Dewey, Austin Moore, Hyunkook Lee","doi":"10.17743/jaes.2022.0153","DOIUrl":"https://doi.org/10.17743/jaes.2022.0153","url":null,"abstract":"This paper presents the practitioners’ perspective on mixing popular music in spatial audio, particularly Dolby Atmos and the binaural mixes generated by the Dolby and Apple renderers. It presents the results of a dual-stage study, which utilized focus groups with eight professional music producers and a questionnaire completed by 140 practitioners. Analysis revealed the continued influence of stereo approaches on mix engineers, partly due to its historical dominance as a production platform and consumers’ continued use of headphones. It was also found that core elements of popular music productions, such as snare drums, tom-tom drums, kick drums, bass guitars, main guitars, and vocals, were less likely to have binaural processing applied compared with other sources. It was also shown there were perceived differences in the suitability of spatial audio mixing for specific genres, with electronic dance music, jazz, pop, classical, and world music rated as the most suitable. Regarding the binaural renderers, there was less user satisfaction with the Apple device compared with Dolby’s, and this dissatisfaction manifested mainly in the need for more user control. Finally, mix engineers were very aware of the importance of their mixes translating to smaller speaker systems and headphone playback, in particular.","PeriodicalId":50008,"journal":{"name":"Journal of the Audio Engineering Society","volume":null,"pages":null},"PeriodicalIF":1.1,"publicationDate":"2024-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141827633","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comparison of Non-Parametric Interpolation Techniques for Sparsely Measured Binaural Room Impulse Responses","authors":"David Bau, Hendrik Himmelein, Christoph Pörschmann","doi":"10.17743/jaes.2022.0150","DOIUrl":"https://doi.org/10.17743/jaes.2022.0150","url":null,"abstract":"This study investigates different interpolation techniques for spatially upsampling Binaural Room Impulse Responses (BRIRs) measured on a sparse grid of view orientations. In this context, the authors recently presented the Spherical Array Interpolation by Time Alignment (SARITA) method for interpolating spherical microphone array signals with a limited number of microphones, which is adapted for the spatial upsampling of sparse BRIR datasets in the present work. SARITA is compared with two existing nonparametric BRIR-interpolation methods and naive linear interpolation. The study provides a technical and perceptual analysis of the interpolation performance. The results show the suitability of all interpolation methods apart from linear interpolation to achieving a realistic auralization, even for very sparse BRIR sets. For angular resolutions of 30° and real-world stimuli, most participants could not distinguish SARITA from an artifact-free reference.","PeriodicalId":50008,"journal":{"name":"Journal of the Audio Engineering Society","volume":null,"pages":null},"PeriodicalIF":1.1,"publicationDate":"2024-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141827823","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Letting Pulsars Sing: Sonification With Granular Synthesis","authors":"Mara Helmuth","doi":"10.17743/jaes.2022.0147","DOIUrl":"https://doi.org/10.17743/jaes.2022.0147","url":null,"abstract":"An astronomy sonification project has been initiated to create sound and music from the data of pulsars in space. Pulsars are formed when some stars burn out all of their fuel and emit electromagnetic radiation, which hits earth periodically as the pulsar rotates. Each pulsar has unique characteristics. The source of the data is the online Pulsar Catalog from the Australian National Telescope Facility. The first result is a stereo fixed media composition, From Orion to Cassiopeia, which reveals a sweep of much of the Milky Way, displaying audio for many of the known pulsars. Galactic longitude, rotation speed, pulse width, mean flux density, age, and distance are mapped to granular synthesis parameters. Sound event duration, amplitude, amount of reverberation, grain rate, grain duration, grain frequency, and panning are controlled by the data. The piece was created with the new SGRAN2() instrument in the RTcmix music programming language.","PeriodicalId":50008,"journal":{"name":"Journal of the Audio Engineering Society","volume":null,"pages":null},"PeriodicalIF":1.4,"publicationDate":"2024-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141017205","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Speech, Nonspeech Audio, and Visual Interruptions of a Tracking Task: A Replication and Extension of Nees and Sampsell (2021)","authors":"Michael A. Nees, Claire Liu, Krista Bogan","doi":"10.17743/jaes.2022.0142","DOIUrl":"https://doi.org/10.17743/jaes.2022.0142","url":null,"abstract":"Interruptions from technology—such as alerts from mobile communication devices—are a pervasive aspect of modern life. Interruptions can be detrimental to performance of the ongoing, interrupted task. Designers often can choose whether interruptions are delivered as visual or auditory alerts. Contradictory theories have emerged regarding whether auditory or visual alerts are more harmful to performance of ongoing visual tasks. Multiple Resources Theory predicts better overall performance with auditory alerts, but Auditory Preemption Theory predicts better overall performance with visual alerts. Nees and Sampsell previously found that multitasking was superior with nonspeech auditory alerts as compared to visual alerts. In the current experiment, their methods were replicated and extended to include a speech auditory alerts condition. Performance of the ongoing tracking task was worse with interruption from visual alerts, and perceived workload also was highest in this condition. Reaction time to alerts was fastest with visual alerts. There also was converging evidence to suggest that performance with speech alerts was superior to performance with nonspeech tonal alerts. The current experiment replicated the results of Nees and Sampsell and extended their findings to speech alert sounds. Like in their study, the pattern of findings here support Multiple Resources Theory over Auditory Preemption Theory.","PeriodicalId":50008,"journal":{"name":"Journal of the Audio Engineering Society","volume":null,"pages":null},"PeriodicalIF":1.4,"publicationDate":"2024-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141015505","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Natural Sonification Mapping for Handwriting","authors":"Katharina Groß-Vogt, Noah Rachdi, Matthias Frank","doi":"10.17743/jaes.2022.0148","DOIUrl":"https://doi.org/10.17743/jaes.2022.0148","url":null,"abstract":"The sonification of handwriting has been shown effective in various learning tasks. In this paper, the authors investigate the sound design used for handwriting interaction based on a simple and cost-efficient prototype. The authentic interaction sound is compared with physically informed sonification designs that employ either natural or inverted mapping. In an experiment, participants copied text and drawings. The authors found simple measures of the structure-borne audio signal that showed how participants were affected in their movements, but only when drawing. In contrast, participants rated the sound features differently only for writing. The authentic interaction sound generally scored best, followed by a natural sonification mapping.","PeriodicalId":50008,"journal":{"name":"Journal of the Audio Engineering Society","volume":null,"pages":null},"PeriodicalIF":1.4,"publicationDate":"2024-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141017521","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}