Gino Brunner, Nawel Naas, Sveinn Pálsson, Oliver Richter, Roger Wattenhofer
Monaural Music Source Separation using a ResNet Latent Separator Network
2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI), published 2019-11-01
DOI: 10.1109/ICTAI.2019.00157 (https://doi.org/10.1109/ICTAI.2019.00157)
Citations: 6
Abstract
In this paper we study the problem of monaural (single-channel) music source separation, where a piece of music is to be separated into its main constituent sources. We propose a simple yet effective deep neural network architecture based on a ResNet autoencoder. We investigate several data augmentation and post-processing methods to improve the separation results, and we outperform various state-of-the-art monaural source separation methods on the DSD100 and MUSDB18 datasets. Our results suggest that pushing the state of the art in monaural music source separation further requires more data, better data augmentation methods, and more effective post-processing methods, and not necessarily ever more complex neural network architectures.