Spatial audio signal processing for augmented telepresence applications
Thomas Deppisch
Science Talks, vol. 13, Article 100421. Published 2025-01-17.
DOI: 10.1016/j.sctalk.2025.100421
https://www.sciencedirect.com/science/article/pii/S2772569325000039
Citation count: 0
Abstract
During the COVID-19 pandemic, the shift to remote communication, particularly through video calls, brought both opportunities and challenges. Although initially a welcome alternative to in-person meetings, virtual gatherings became increasingly overwhelming, giving rise to the term “Zoom fatigue.” At the same time, reduced travel highlighted the potential environmental benefits of online meetings. My PhD research aims to make remote communication more natural and thereby make virtual meetings more appealing. Specifically, I develop signal processing techniques that preserve the spatial and acoustic cues important for natural speech perception, such as those underlying the cocktail party effect. By modeling microphone array signals, particularly from arrays integrated into smart glasses or augmented reality headsets, I estimate and apply spatial room transfer functions to create natural binaural audio experiences. My work also addresses the challenges posed by head movement, using continuous-space domain estimation to update the room transfer functions during head rotations. First results show that the method is effective under controlled conditions. Future work will investigate the approach in more realistic scenarios.