A. Serra, A. Busson, Alan Livio Vasconcelos Guedes, S. Colcher
{"title":"使用基于深度学习的丢失频率预测模型来提高高度退化音乐的质量","authors":"A. Serra, A. Busson, Alan Livio Vasconcelos Guedes, S. Colcher","doi":"10.1145/3470482.3479635","DOIUrl":null,"url":null,"abstract":"Audio quality degradation can have many causes. For musical applications, this fragmentation may lead to highly unpleasant experiences. Restoration algorithms may be employed to reconstruct missing parts of the audio in a similar way as for image reconstruction --- in an approach called audio inpainting. Current state-of-the art methods for audio inpainting cover limited scenarios, with well-defined gap windows and little variety of musical genres. In this work, we propose a Deep-Learning-based (DL-based) method for audio inpainting accompanied by a dataset with random fragmentation conditions that approximate real impairment situations. The dataset was collected using tracks from different music genres to provide a good signal variability. Our best model improved the quality of all musical genres, obtaining an average of 12.9 dB of PSNR, although it worked better for musical genres in which acoustic instruments are predominant.","PeriodicalId":350776,"journal":{"name":"Proceedings of the Brazilian Symposium on Multimedia and the Web","volume":"22 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Quality Enhancement of Highly Degraded Music Using Deep Learning-Based Prediction Models for Lost Frequencies\",\"authors\":\"A. Serra, A. Busson, Alan Livio Vasconcelos Guedes, S. Colcher\",\"doi\":\"10.1145/3470482.3479635\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Audio quality degradation can have many causes. For musical applications, this fragmentation may lead to highly unpleasant experiences. Restoration algorithms may be employed to reconstruct missing parts of the audio in a similar way as for image reconstruction --- in an approach called audio inpainting. Current state-of-the art methods for audio inpainting cover limited scenarios, with well-defined gap windows and little variety of musical genres. In this work, we propose a Deep-Learning-based (DL-based) method for audio inpainting accompanied by a dataset with random fragmentation conditions that approximate real impairment situations. The dataset was collected using tracks from different music genres to provide a good signal variability. Our best model improved the quality of all musical genres, obtaining an average of 12.9 dB of PSNR, although it worked better for musical genres in which acoustic instruments are predominant.\",\"PeriodicalId\":350776,\"journal\":{\"name\":\"Proceedings of the Brazilian Symposium on Multimedia and the Web\",\"volume\":\"22 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-09-27\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the Brazilian Symposium on Multimedia and the Web\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3470482.3479635\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Brazilian Symposium on Multimedia and the Web","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3470482.3479635","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Quality Enhancement of Highly Degraded Music Using Deep Learning-Based Prediction Models for Lost Frequencies
Audio quality degradation can have many causes. For musical applications, this fragmentation may lead to highly unpleasant experiences. Restoration algorithms may be employed to reconstruct missing parts of the audio in a similar way as for image reconstruction --- in an approach called audio inpainting. Current state-of-the art methods for audio inpainting cover limited scenarios, with well-defined gap windows and little variety of musical genres. In this work, we propose a Deep-Learning-based (DL-based) method for audio inpainting accompanied by a dataset with random fragmentation conditions that approximate real impairment situations. The dataset was collected using tracks from different music genres to provide a good signal variability. Our best model improved the quality of all musical genres, obtaining an average of 12.9 dB of PSNR, although it worked better for musical genres in which acoustic instruments are predominant.