2021 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA): Latest Papers

Mean-Square-Error-Based Secondary Source Placement in Sound Field Synthesis with Prior Information on Desired Field
2021 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) Pub Date : 2021-10-17 DOI: 10.1109/WASPAA52581.2021.9632799
Keisuke Kimura, Shoichi Koyama, Natsuki Ueno, H. Saruwatari
Abstract: A method of optimizing secondary source placement in sound field synthesis is proposed. Such an optimization method will be useful when the allowable placement region and available number of loudspeakers are limited. We formulate a mean-square-error-based cost function, incorporating the statistical properties of possible desired sound fields, for general linear-least-squares-based sound field synthesis methods, including pressure matching and (weighted) mode matching, whereas most of the current methods are applicable only to the pressure-matching method. An efficient greedy algorithm for minimizing the proposed cost function is also derived. Numerical experiments indicated that a higher reproduction accuracy can be achieved by the placement optimized by the proposed method than by the empirically used regular placement.
Citations: 1
Closing the Gap Between Time-Domain Multi-Channel Speech Enhancement on Real and Simulation Conditions
2021 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) Pub Date : 2021-10-17 DOI: 10.1109/WASPAA52581.2021.9632720
Wangyou Zhang, Jing Shi, Chenda Li, Shinji Watanabe, Y. Qian
Abstract: Deep-learning-based time-domain models, e.g., Conv-TasNet, have shown great potential in both single-channel and multi-channel speech enhancement. However, many experiments on time-domain speech enhancement models are done in simulated conditions, and it is not well studied whether the good performance generalizes to real-world scenarios. In this paper, we aim to provide an insightful investigation of applying multi-channel Conv-TasNet-based speech enhancement to both simulated and real data. Our preliminary experiments show a large performance gap between the two conditions in terms of ASR performance. Several approaches are applied to close this gap, including the integration of multi-channel Conv-TasNet into the beamforming model with various strategies, and the joint training of speech enhancement and speech recognition models. Our experiments on the CHiME-4 corpus show that our proposed approaches can greatly reduce the speech recognition performance discrepancy between simulated and real data, while preserving the strong speech enhancement capability in the frontend.
Citations: 13
Prediction of Missing Frequency Response Functions Through Deep Image Prior
2021 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) Pub Date : 2021-10-17 DOI: 10.1109/WASPAA52581.2021.9632759
R. Malvermi, F. Antonacci, A. Sarti, R. Corradi
Abstract: Vibration analysis is crucial when designing and monitoring resonant structures. The characterization of vibrational properties in mechanical systems, e.g., machinery or musical instruments, can indeed detect noise sources and damage. Several methods can retrieve these parameters starting from a set of measurements. The level of detail in the estimate mostly depends on the amount and distribution of points acquired over space. A potential issue for these techniques is the presence of regions over the object where sensors cannot be attached. In this case, an interpolation scheme with a suitable prior of the data model should be devised. We propose here to predict the missing vibrational data within the framework of image inpainting and apply a fully data-driven method based on Deep Image Prior, which allows the prior to be captured from the data without the need for a dataset. The performance is assessed in the case of violin top plates. The proposed method proved to predict data better, in particular resonances, for points close to the boundary, whereas a baseline based on Thin Plate Splines fails due to the reduced number of available samples.
Citations: 1
Analysis of Frequency-Dependent Behavior of Room Reflections Using Spherical Microphone Measurements & Von Mises-Fisher Clustering
2021 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) Pub Date : 2021-10-17 DOI: 10.1109/WASPAA52581.2021.9632706
Amy Bastine, T. Abhayapala, J. Zhang
Abstract: This paper presents a room acoustic analysis tool capable of power response generation and directional characterization of room reflections across different frequencies using spherical microphone array measurements. The method exploits the spatial correlation between the frequency-dependent spherical harmonic coefficients of the reverberant soundfield and extracts its statistical features using von Mises-Fisher (vMF) clustering. We use this tool to examine the acoustic response of a small and a large room to achieve a profound understanding of the frequency-related variations in the directional characteristics of room reflections. In comparison to the eigen-beam multiple signal classification (EB-MUSIC) method, the proposed technique incorporates a more realistic room response over a broader frequency range. The experimental observations prove the potential of the proposed tool in determining the frequency-dependent room acoustic parameters and can lead to the design of smarter room acoustic treatment solutions.
Citations: 0
Stochastic Reverberation Model with a Frequency Dependent Attenuation
2021 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) Pub Date : 2021-10-17 DOI: 10.1109/WASPAA52581.2021.9632792
Achille Aknin, Roland Badeau
Abstract: In various audio signal processing applications, such as source separation and dereverberation, accurate mathematical modeling of both source signals and room reverberation is needed to properly describe the audio data. In a previous paper, we introduced a stochastic room impulse response model based on the image source principle, and we proposed an expectation-maximization algorithm that was able to efficiently estimate the model parameters in various experimental settings. This paper aims to extend the model in order to account for the dependency of the exponential decay on frequency, due to the walls usually absorbing less energy at low frequencies than at high frequencies. Our experimental results show that this refinement of the model is able to generate realistic room impulse responses that are perceptually very close to the original ones.
Citations: 3
Rendering of Source Spread for Arbitrary Playback Setups Based on Spatial Covariance Matching
2021 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) Pub Date : 2021-10-17 DOI: 10.1109/WASPAA52581.2021.9632724
L. McCormack, A. Politis, V. Pulkki
Abstract: This paper proposes an algorithm for rendering spread sound sources, which are mutually incoherent across their extents, over arbitrary playback formats. The approach involves first generating signals corresponding to the centre of the spread source for the intended playback setup, along with decorrelated variants, followed by defining a diffuse spatial covariance matrix for the confined target spreading area. The mixing matrices required to combine these signals, in a manner whereby the resulting output signals exhibit the target inter-channel relationships for an incoherently spread source, are computed based on an optimised solution which is constrained to preserve signal fidelity. The proposed solution is evaluated in the context of producing extended sound sources for binaural playback. Objective perceptual metrics are computed and shown to be comparable to those derived from an ideal incoherently spread reference. Signal distortion measures are also calculated for speech, musical, and ambience recordings, which indicate higher signal fidelity produced by the proposed constrained spatial covariance matching solution, compared to an unconstrained alternative. These improvements in signal fidelity are further demonstrated by the provided audio examples and open-source audio plug-in.
Citations: 4
User-Guided One-Shot Deep Model Adaptation for Music Source Separation
2021 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) Pub Date : 2021-10-17 DOI: 10.1109/WASPAA52581.2021.9632717
Giorgia Cantisani, A. Ozerov, S. Essid, G. Richard
Abstract: Music source separation is the task of isolating individual instruments which are mixed in a musical piece. This task is particularly challenging, and even state-of-the-art models can hardly generalize to unseen test data. Nevertheless, prior knowledge about individual sources can be used to better adapt a generic source separation model to the observed signal. In this work, we propose to exploit a temporal segmentation provided by the user that indicates when each instrument is active, in order to fine-tune a pre-trained deep model for source separation and adapt it to one specific mixture. This paradigm can be referred to as user-driven one-shot deep model adaptation for music source separation, as the adaptation acts on the target song instance only. Our results are promising and show that state-of-the-art source separation models have large margins of improvement, especially for those instruments which are underrepresented in the training data.
Citations: 2
Spatial Coding for Microphone Arrays Using IPNLMS-Based RTF Estimation
2021 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) Pub Date : 2021-10-17 DOI: 10.1109/WASPAA52581.2021.9632747
Daniel T. Jones, D. Sharma, S. Kruchinin, P. Naylor
Abstract: We propose a method for encoding multichannel microphone array signals and show that our proposed algorithm can operate effectively at very low bitrates. Our approach leverages the high interchannel correlations that arise from the close proximity of microphones in an array to compactly represent the signals. An $M$-channel microphone array signal is encoded as one reference signal and $M-1$ Relative Transfer Functions (RTFs). When the RTFs require updating only infrequently, a significant reduction in data-rate is obtained. Applications of interest include cloud-based beamforming and End-to-End Automatic Speech Recognition (ASR) systems. The efficiency of this encoding enables multichannel audio to be transmitted to the cloud at very low bitrates. A system has been developed that estimates, and periodically updates, the RTFs between each channel of the array and a chosen reference channel using an Improved Proportionate Normalized Least Mean Squares (IPNLMS) adaptive filter. The proposed system is experimentally evaluated in comparison with the Opus codec. It achieves equal ΔPESQ performance with a data-rate reduction of up to 90% and un-degraded Word Error Rate (WER) down to bitrates as low as 3.3 kbps.
Citations: 1
Spherical Array Based Drone Noise Measurements and Modelling for Drone Noise Reduction via Propeller Phase Control
2021 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) Pub Date : 2021-10-17 DOI: 10.1109/WASPAA52581.2021.9632719
Hanwen Bi, Fei Ma, T. Abhayapala, P. Samarasinghe
Abstract: Drone noise is increasingly becoming an annoying problem as drones are widely used in everyday applications. This paper investigates the problem of controlling farfield drone noise by manipulating the relative phase of the propellers. The methodology includes (i) measurement of the nearfield propeller noise using a specially designed open spherical array, (ii) development of an extrapolation method to transform nearfield noise to a farfield target region, and (iii) simulation of the farfield noise field with varying propeller relative phases. We further investigate the influence of drone configuration on phase-controlled noise reduction for a farfield target region, and show that −6.8 dB noise reduction can be achieved at the blade passage frequencies. The analysis of residual noise shows the potential benefit of combining phase control with active noise control.
Citations: 7
Spatial Subtraction of Reflections from Room Impulse Responses Measured with a Spherical Microphone Array
2021 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) Pub Date : 2021-10-17 DOI: 10.1109/WASPAA52581.2021.9632764
T. Deppisch, J. Ahrens, S. V. A. Garí, P. Calamia
Abstract: We propose a method for the decomposition of measured directional room impulse responses (DRIRs) into prominent reflections and a residual. The method comprises obtaining a fingerprint of the time-frequency signal that a given reflection carries, imposing this time-frequency fingerprint on a plane-wave prototype that exhibits the same propagation direction as the reflection, and finally subtracting this plane-wave prototype from the DRIR. Our main contributions are the formulation of the problem as a spatial subtraction as well as the incorporation of order truncation, spatial aliasing and regularization of the radial filters into the definition of the underlying beamforming problem. We demonstrate, based on simulated as well as measured array impulse responses, that our method increases the accuracy of the model of the reflection under test and consequently decreases the energy of the residual that remains in a measured DRIR after the spatial subtraction.
Citations: 4