Authors: K. Ozawa, M. Morise, S. Sakamoto
Published in: 2018 5th International Conference on Systems and Informatics (ICSAI), 2018-11-01
DOI: 10.1109/ICSAI.2018.8599483 (https://doi.org/10.1109/ICSAI.2018.8599483)
Sound Source Separation by Instantaneous Estimation-Based Spectral Subtraction
This project aims to achieve sound source separation based on the two-dimensional fast Fourier transform (2D FFT) of a spatio-temporal sound pressure distribution image consisting of the outputs of a microphone array. The target sound, which arrives from the front of the array, forms vertical stripes in the image. Therefore, its spectral components are perfectly localized as direct current (DC) components along the spatial frequency axis in the 2D-FFT spectrum. In this study, noise suppression was performed by spectral subtraction after the DC components of the noise were instantaneously estimated from the spectrum using artificial neural networks. As a result, the performance of the proposed method with a 14-cm-long array was comparable to that of the conventional delay-and-sum beamformer with an approximately 5-m-long array.
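The localization property described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the array size, the signal frequencies, and the per-microphone noise delay are all hypothetical, and the neural-network noise estimator is replaced by an oracle that reads the noise DC magnitudes directly from the simulated noise. It only shows that a frontal target occupies the spatial DC row of the 2D-FFT spectrum, that off-axis noise leaks into that row, and how spectral subtraction on the DC row would then recover the target.

```python
import numpy as np

# Hypothetical setup: 8 microphones (rows) and 256 time samples (columns).
M, N = 8, 256
n = np.arange(N)

# Frontal target: reaches every microphone at the same instant, so all rows
# of the spatio-temporal image are identical (vertical stripes).
target = np.sin(2 * np.pi * 10 * n / N)
target_image = np.tile(target, (M, 1))

# Off-axis noise: modelled here as a circular delay of 3 samples per
# microphone (an assumption made purely for illustration).
noise = np.sin(2 * np.pi * 16 * n / N)
noise_image = np.stack([np.roll(noise, 3 * m) for m in range(M)])

# 2D FFT: axis 0 becomes spatial frequency, axis 1 temporal frequency.
target_spec = np.fft.fft2(target_image)
noise_spec = np.fft.fft2(noise_image)
mix_spec = np.fft.fft2(target_image + noise_image)

# The target lives only in the spatial DC row (index 0) ...
target_off_dc = np.abs(target_spec[1:]).max()
# ... but the noise also leaks into that row, so it has to be estimated.
noise_in_dc = np.abs(noise_spec[0]).max()

# Oracle stand-in for the paper's neural-network estimate of the noise DC
# magnitudes (assumption: a perfect estimate is available).
noise_dc_estimate = np.abs(noise_spec[0])

# Magnitude spectral subtraction on the DC row, keeping the mixture phase
# and flooring negative magnitudes at zero.
mix_dc = mix_spec[0]
clean_mag = np.maximum(np.abs(mix_dc) - noise_dc_estimate, 0.0)
clean_dc = clean_mag * np.exp(1j * np.angle(mix_dc))

# The DC row is the sum over microphones, so divide by M after inverting.
estimate = np.fft.ifft(clean_dc).real / M

print("target energy outside the DC row:", target_off_dc)
print("largest noise component in the DC row:", noise_in_dc)
print("max reconstruction error:", np.abs(estimate - target).max())
```

In this toy case the target and noise occupy different temporal frequency bins, so magnitude subtraction with an oracle estimate is essentially exact. With overlapping spectra or an imperfect estimate, the result would degrade, which is why the quality of the instantaneous noise estimate matters.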