Original Music Generation using Recurrent Neural Networks with Self-Attention

Akash Jagannathan, Bharathi Chandrasekaran, Shubham Dutta, U. Patil, M. Eirinaki
DOI: 10.1109/AITest55621.2022.00017
Published in: 2022 IEEE International Conference On Artificial Intelligence Testing (AITest)
Publication date: 2022-08-01
Citations: 1

Abstract

A recent trend in deep learning is using state-of-the-art models to generate human art forms. Using such “intelligent” models to generate novel musical compositions is a thriving area of research. The motivation is to use the capacity of deep learning architectures and training techniques to learn musical styles from arbitrary musical corpora automatically and then generate samples from the estimated distribution. We focus on two popular state-of-the-art models used in deep generative learning of music, namely recursive neural networks (RNN) and the self-attention mechanism. We provide a systematic evaluation of state-of-the-art models used in generative deep learning for music but also contribute novel architectures and compare them to the established baselines. The models are trained on piano compositions embedded in MIDI format from Google’s Maestro dataset. A big challenge in such learning tasks is to evaluate the outcome of such learning tasks, since art is very subjective and hard to evaluate quantitatively. Therefore, in addition to the experimental evaluation, we also conduct a blind user study. We conclude that a double-stacked RNN model with a self-attention layer was observed to have the most optimal training time, and the pieces generated by a triple-stacked RNN model with self-attention layers were deemed the most subjectively appealing and authentic.
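The abstract centers on augmenting stacked RNNs with the self-attention mechanism. As a minimal sketch of what a self-attention layer computes over a sequence of note embeddings, here is scaled dot-product self-attention in NumPy; the sequence length, embedding dimension, and the choice of using the raw input as queries, keys, and values are illustrative assumptions, not the paper's configuration:

```python
import numpy as np

def self_attention(x):
    """Scaled dot-product self-attention over a (seq_len, d) input sequence.

    Illustrative sketch: queries, keys, and values all reuse the raw input
    (no learned projections), which is not necessarily the paper's setup.
    """
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)                    # (seq_len, seq_len) similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ x, weights                      # context vectors + attention map

# Hypothetical sequence of 8 note embeddings of dimension 16
rng = np.random.default_rng(0)
notes = rng.normal(size=(8, 16))
out, weights = self_attention(notes)
print(out.shape)  # (8, 16): one context-aware vector per input note
```

In a stacked-RNN generator of the kind the paper describes, such a layer would let each generated note attend to all earlier notes in the piece, rather than relying solely on the recurrent hidden state.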