Singing Evaluation based on Deep Metric Learning

Proceedings of the 2019 3rd International Symposium on Computer Science and Intelligent Control. Pub Date: 2019-09-25. DOI: 10.1145/3386164.3389096

Terry Tan


Citations: 3

Abstract


This paper evaluates singing performance using deep metric learning. Because the model takes the vocal sound as input, the vocals must first be separated from the soundtrack. After separation, the vocal audio is represented as a Mel-spectrogram, which serves as the input to the proposed model. Building the model is split into a pre-training step and a training step: meta learning is adopted for pre-training, while deep metric learning is adopted for training. The output of the model is a Euclidean distance that reflects the singer's performance, determined by comparing the singer's sound to the original. Experimental results show that the singing evaluation is stable and reliable.
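The comparison step described in the abstract can be illustrated with a minimal sketch. It assumes the trained model has already mapped each vocal Mel-spectrogram to a fixed-length embedding vector; the three-dimensional vectors below are hypothetical placeholders, not outputs of the paper's model. The abstract does not say which metric-learning loss is used, so the triplet margin loss shown here is an assumption, chosen only because it is a common choice in deep metric learning.

```python
import math
from typing import Sequence


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean (L2) distance between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(
    anchor: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    margin: float = 1.0,
) -> float:
    """Triplet margin loss: max(0, d(anchor, positive) - d(anchor, negative) + margin).

    Assumption: the source does not name its loss; this is one common option.
    """
    d_pos = euclidean_distance(anchor, positive)
    d_neg = euclidean_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical embeddings (illustrative values only).
original = [0.0, 0.0, 0.0]   # embedding of the original singer's vocal
close_take = [1.0, 0.0, 0.0]  # a performance that sounds close to the original
far_take = [3.0, 4.0, 0.0]    # a performance that sounds far from the original

# A smaller distance to the original indicates a closer performance.
close_distance = euclidean_distance(original, close_take)
far_distance = euclidean_distance(original, far_take)
```

Under this sketch, a lower distance means the performance is closer to the original. How a distance would be turned into a user-facing score is not stated in the abstract and is left out here.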

