Martin Jälmby, Filip Elvander, Toon van Waterschoot
{"title":"压缩室内脉冲响应,实现紧凑存储和快速低延迟卷积","authors":"Martin Jälmby, Filip Elvander, Toon van Waterschoot","doi":"10.1186/s13636-024-00363-5","DOIUrl":null,"url":null,"abstract":"Room impulse responses (RIRs) are used in several applications, such as augmented reality and virtual reality. These applications require a large number of RIRs to be convolved with audio, under strict latency constraints. In this paper, we consider the compression of RIRs, in conjunction with fast time-domain convolution. We consider three different methods of RIR approximation for the purpose of RIR compression and compare them to state-of-the-art compression. The methods are evaluated using several standard objective quality measures, both channel-based and signal-based. We also propose a novel low-rank-based algorithm for fast time-domain convolution and show how the convolution can be carried out without the need to decompress the RIR. Numerical simulations are performed using RIRs of different lengths, recorded in three different rooms. It is shown that compression using low-rank approximation is a very compelling option to the state-of-the-art Opus compression, as it performs as well or better than on all but one considered measure, with the added benefit of being amenable to fast time-domain convolution.","PeriodicalId":49202,"journal":{"name":"Eurasip Journal on Audio Speech and Music Processing","volume":"16 1","pages":""},"PeriodicalIF":1.7000,"publicationDate":"2024-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Compression of room impulse responses for compact storage and fast low-latency convolution\",\"authors\":\"Martin Jälmby, Filip Elvander, Toon van Waterschoot\",\"doi\":\"10.1186/s13636-024-00363-5\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Room impulse responses (RIRs) are used in several applications, such as augmented reality and virtual reality. These applications require a large number of RIRs to be convolved with audio, under strict latency constraints. In this paper, we consider the compression of RIRs, in conjunction with fast time-domain convolution. We consider three different methods of RIR approximation for the purpose of RIR compression and compare them to state-of-the-art compression. The methods are evaluated using several standard objective quality measures, both channel-based and signal-based. We also propose a novel low-rank-based algorithm for fast time-domain convolution and show how the convolution can be carried out without the need to decompress the RIR. Numerical simulations are performed using RIRs of different lengths, recorded in three different rooms. It is shown that compression using low-rank approximation is a very compelling option to the state-of-the-art Opus compression, as it performs as well or better than on all but one considered measure, with the added benefit of being amenable to fast time-domain convolution.\",\"PeriodicalId\":49202,\"journal\":{\"name\":\"Eurasip Journal on Audio Speech and Music Processing\",\"volume\":\"16 1\",\"pages\":\"\"},\"PeriodicalIF\":1.7000,\"publicationDate\":\"2024-09-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Eurasip Journal on Audio Speech and Music Processing\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1186/s13636-024-00363-5\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"ACOUSTICS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Eurasip Journal on Audio Speech and Music Processing","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1186/s13636-024-00363-5","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ACOUSTICS","Score":null,"Total":0}
引用次数: 0
摘要
室内脉冲响应(RIR)被用于增强现实和虚拟现实等多种应用中。这些应用需要在严格的延迟限制下,将大量 RIR 与音频进行卷积。在本文中,我们将结合快速时域卷积来考虑 RIR 的压缩问题。为了压缩 RIR,我们考虑了三种不同的 RIR 近似方法,并将它们与最先进的压缩方法进行了比较。我们使用几种标准的客观质量测量方法对这些方法进行了评估,既有基于信道的,也有基于信号的。我们还提出了一种基于低秩的新型快速时域卷积算法,并展示了如何在无需解压缩 RIR 的情况下进行卷积。我们使用在三个不同房间录制的不同长度的 RIR 进行了数值模拟。结果表明,与最先进的 Opus 压缩技术相比,使用低秩近似法进行压缩是一种非常有吸引力的选择,因为除了一项考虑的指标外,它在其他所有指标上的表现都与 Opus 压缩技术相当或更好,而且还具有适合快速时域卷积的额外优势。
Compression of room impulse responses for compact storage and fast low-latency convolution
Room impulse responses (RIRs) are used in several applications, such as augmented reality and virtual reality. These applications require a large number of RIRs to be convolved with audio, under strict latency constraints. In this paper, we consider the compression of RIRs, in conjunction with fast time-domain convolution. We consider three different methods of RIR approximation for the purpose of RIR compression and compare them to state-of-the-art compression. The methods are evaluated using several standard objective quality measures, both channel-based and signal-based. We also propose a novel low-rank-based algorithm for fast time-domain convolution and show how the convolution can be carried out without the need to decompress the RIR. Numerical simulations are performed using RIRs of different lengths, recorded in three different rooms. It is shown that compression using low-rank approximation is a very compelling option to the state-of-the-art Opus compression, as it performs as well or better than on all but one considered measure, with the added benefit of being amenable to fast time-domain convolution.
期刊介绍:
The aim of “EURASIP Journal on Audio, Speech, and Music Processing” is to bring together researchers, scientists and engineers working on the theory and applications of the processing of various audio signals, with a specific focus on speech and music. EURASIP Journal on Audio, Speech, and Music Processing will be an interdisciplinary journal for the dissemination of all basic and applied aspects of speech communication and audio processes.