Multi-band multi-scale DenseNet with dilated convolution for background music separation
Authors: Woon-Haeng Heo, Hyemi Kim, O. Kwon
Journal of the Acoustical Society of Korea, vol. 38, pp. 697-702, published 2019-11-01
DOI: https://doi.org/10.7776/ASK.2019.38.6.697
We propose a multi-band multi-scale DenseNet with dilated convolution that separates background music signals from broadcast content. Dilated convolution allows the network to learn the multi-scale context information present in the spectrogram. In computer simulation experiments, the proposed architecture improves the signal-to-distortion ratio (SDR) by 0.15 dB and 0.27 dB in 0 dB and -10 dB signal-to-noise ratio (SNR) environments, respectively.
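To illustrate why dilated convolution captures multi-scale context, here is a minimal sketch in plain Python. It is a 1-D simplification; the paper operates on spectrograms, presumably with 2-D convolutions. The function names `dilated_conv1d` and `receptive_field`, the kernel values, and the dilation rates are illustrative assumptions, not taken from the paper.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """'Valid' 1-D dilated convolution (cross-correlation, as in deep-learning
    libraries). Kernel taps are spaced `dilation` samples apart."""
    span = (len(kernel) - 1) * dilation + 1  # input samples covered by one output
    out = []
    for start in range(len(signal) - span + 1):
        acc = 0.0
        for k, w in enumerate(kernel):
            acc += w * signal[start + k * dilation]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Receptive field (in input samples) of a stack of stride-1 dilated
    convolutions that all share the same kernel size."""
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


if __name__ == "__main__":
    x = [float(i) for i in range(10)]
    kernel = [1.0, 1.0, 1.0]  # hypothetical 3-tap kernel

    # Same number of weights, but dilation 2 covers 5 input samples instead of 3.
    print(dilated_conv1d(x, kernel, dilation=1))
    print(dilated_conv1d(x, kernel, dilation=2))

    # Three stacked 3-tap layers: ordinary vs. exponentially growing dilation.
    print(receptive_field(3, [1, 1, 1]))  # 7
    print(receptive_field(3, [1, 2, 4]))  # 15
```

With the same number of weights per layer, the dilated stack sees roughly twice as much context as the undilated one, which is the property the abstract relies on when it refers to multi-scale context.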