Original Music Generation using Recurrent Neural Networks with Self-Attention

Akash Jagannathan, Bharathi Chandrasekaran, Shubham Dutta, U. Patil, M. Eirinaki
DOI: 10.1109/AITest55621.2022.00017
Published in: 2022 IEEE International Conference On Artificial Intelligence Testing (AITest)
Publication date: 2022-08-01
Citations: 1

Abstract

A recent trend in deep learning is using state-of-the-art models to generate human art forms. Using such “intelligent” models to generate novel musical compositions is a thriving area of research. The motivation is to use the capacity of deep learning architectures and training techniques to learn musical styles from arbitrary musical corpora automatically and then generate samples from the estimated distribution. We focus on two popular state-of-the-art models used in deep generative learning of music, namely recursive neural networks (RNN) and the self-attention mechanism. We provide a systematic evaluation of state-of-the-art models used in generative deep learning for music but also contribute novel architectures and compare them to the established baselines. The models are trained on piano compositions embedded in MIDI format from Google’s Maestro dataset. A big challenge in such learning tasks is to evaluate the outcome of such learning tasks, since art is very subjective and hard to evaluate quantitatively. Therefore, in addition to the experimental evaluation, we also conduct a blind user study. We conclude that a double-stacked RNN model with a self-attention layer was observed to have the most optimal training time, and the pieces generated by a triple-stacked RNN model with self-attention layers were deemed the most subjectively appealing and authentic.
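The abstract centers on augmenting stacked RNNs with the self-attention mechanism. As a minimal sketch of what a self-attention layer computes over a sequence of note embeddings, here is scaled dot-product self-attention in NumPy; the sequence length, embedding dimension, and the choice of using the raw input as queries, keys, and values are illustrative assumptions, not the paper's configuration:

```python
import numpy as np

def self_attention(x):
    """Scaled dot-product self-attention over a (seq_len, d) input sequence.

    Illustrative sketch: queries, keys, and values all reuse the raw input
    (no learned projections), which is not necessarily the paper's setup.
    """
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)                    # (seq_len, seq_len) similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ x, weights                      # context vectors + attention map

# Hypothetical sequence of 8 note embeddings of dimension 16
rng = np.random.default_rng(0)
notes = rng.normal(size=(8, 16))
out, weights = self_attention(notes)
print(out.shape)  # (8, 16): one context-aware vector per input note
```

In a stacked-RNN generator of the kind the paper describes, such a layer would let each generated note attend to all earlier notes in the piece, rather than relying solely on the recurrent hidden state.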