Interactive Control of Explicit Musical Features in Generative LSTM-based Systems

Maximos A. Kaliakatsos-Papakostas, Aggelos Gkiokas, V. Katsouros
{"title":"生成式lstm系统中显式音乐特征的交互控制","authors":"Maximos A. Kaliakatsos-Papakostas, Aggelos Gkiokas, V. Katsouros","doi":"10.1145/3243274.3243296","DOIUrl":null,"url":null,"abstract":"Long Short-Term Memory (LSTM) neural networks have been effectively applied on learning and generating musical sequences, powered by sophisticated musical representations and integrations into other deep learning models. Deep neural networks, alongside LSTM-based systems, learn implicitly: given a sufficiently large amount of data, they transform information into high-level features that, however, do not relate with the high-level features perceived by humans. For instance, such models are able to compose music in the style of the Bach chorales, but they are not able to compose a less rhythmically dense version of them, or a Bach choral that begins with low and ends with high pitches -- even more so in an interactive way in real-time. This paper presents an approach to creating such systems. A very basic LSTM-based architecture is developed that can compose music that corresponds to user-provided values of rhythm density and pitch height/register. A small initial dataset is augmented to incorporate more intense variations of these two features and the system learns and generates music that not only reflects the style, but also (and most importantly) reflects the features that are explicitly given as input at each specific time. This system -- and future versions that will incorporate more advanced architectures and representation -- is suitable for generating music the features of which are defined in real-time and/or interactively.","PeriodicalId":129628,"journal":{"name":"Proceedings of the Audio Mostly 2018 on Sound in Immersion and Emotion","volume":"108 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":"{\"title\":\"Interactive Control of Explicit Musical Features in Generative LSTM-based Systems\",\"authors\":\"Maximos A. Kaliakatsos-Papakostas, Aggelos Gkiokas, V. Katsouros\",\"doi\":\"10.1145/3243274.3243296\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Long Short-Term Memory (LSTM) neural networks have been effectively applied on learning and generating musical sequences, powered by sophisticated musical representations and integrations into other deep learning models. Deep neural networks, alongside LSTM-based systems, learn implicitly: given a sufficiently large amount of data, they transform information into high-level features that, however, do not relate with the high-level features perceived by humans. For instance, such models are able to compose music in the style of the Bach chorales, but they are not able to compose a less rhythmically dense version of them, or a Bach choral that begins with low and ends with high pitches -- even more so in an interactive way in real-time. This paper presents an approach to creating such systems. A very basic LSTM-based architecture is developed that can compose music that corresponds to user-provided values of rhythm density and pitch height/register. A small initial dataset is augmented to incorporate more intense variations of these two features and the system learns and generates music that not only reflects the style, but also (and most importantly) reflects the features that are explicitly given as input at each specific time. 
This system -- and future versions that will incorporate more advanced architectures and representation -- is suitable for generating music the features of which are defined in real-time and/or interactively.\",\"PeriodicalId\":129628,\"journal\":{\"name\":\"Proceedings of the Audio Mostly 2018 on Sound in Immersion and Emotion\",\"volume\":\"108 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-09-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"8\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the Audio Mostly 2018 on Sound in Immersion and Emotion\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3243274.3243296\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Audio Mostly 2018 on Sound in Immersion and Emotion","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3243274.3243296","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 8

Abstract

Long Short-Term Memory (LSTM) neural networks have been effectively applied to learning and generating musical sequences, powered by sophisticated musical representations and integration into other deep learning models. Deep neural networks, including LSTM-based systems, learn implicitly: given a sufficiently large amount of data, they transform information into high-level features that, however, do not relate to the high-level features perceived by humans. For instance, such models are able to compose music in the style of the Bach chorales, but they are not able to compose a less rhythmically dense version of them, or a Bach chorale that begins with low and ends with high pitches -- even more so in an interactive way in real-time. This paper presents an approach to creating such systems. A very basic LSTM-based architecture is developed that can compose music corresponding to user-provided values of rhythm density and pitch height/register. A small initial dataset is augmented to incorporate more intense variations of these two features, and the system learns and generates music that not only reflects the style but also (and most importantly) reflects the features that are explicitly given as input at each specific time. This system -- and future versions that will incorporate more advanced architectures and representations -- is suitable for generating music whose features are defined in real-time and/or interactively.
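
The abstract does not describe an implementation, but the core idea of conditioning each generation step on explicit feature values can be illustrated with a minimal sketch. The PyTorch code below is an assumption-laden illustration, not the authors' system: the class name, dimensions, token vocabulary, and the way rhythm density and pitch register are encoded as two per-step scalars are all hypothetical choices made for clarity.

```python
# Minimal sketch (assumed, not the paper's code): an LSTM whose per-step input is a
# one-hot note token concatenated with two user-controlled conditioning values,
# rhythm density and pitch register, so the values can be changed while sampling.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConditionedMusicLSTM(nn.Module):
    def __init__(self, vocab_size=130, cond_size=2, hidden_size=256, num_layers=2):
        super().__init__()
        # vocab_size is an assumed token set, e.g. 128 MIDI pitches + rest + hold.
        self.vocab_size = vocab_size
        self.lstm = nn.LSTM(vocab_size + cond_size, hidden_size,
                            num_layers=num_layers, batch_first=True)
        self.proj = nn.Linear(hidden_size, vocab_size)

    def forward(self, tokens, conditions, state=None):
        # tokens:     (batch, time) integer note events
        # conditions: (batch, time, 2) user-provided feature values per step
        x = F.one_hot(tokens, self.vocab_size).float()
        x = torch.cat([x, conditions], dim=-1)
        out, state = self.lstm(x, state)
        return self.proj(out), state  # logits over the next note event

    @torch.no_grad()
    def generate(self, seed_token, conditions, temperature=1.0):
        # conditions: (time, 2) -- e.g. two sliders moved interactively in real time
        tokens, state = [seed_token], None
        for t in range(conditions.shape[0]):
            inp = torch.tensor([[tokens[-1]]])
            cond = conditions[t].view(1, 1, -1)
            logits, state = self.forward(inp, cond, state)
            probs = torch.softmax(logits[0, -1] / temperature, dim=-1)
            tokens.append(torch.multinomial(probs, 1).item())
        return tokens[1:]

if __name__ == "__main__":
    # Drive an (untrained) model with a rising pitch-register curve and sparse rhythm.
    model = ConditionedMusicLSTM()
    steps = 32
    register = torch.linspace(0.0, 1.0, steps)   # low -> high pitches over time
    density = torch.full((steps,), 0.3)          # constant, low rhythm density
    conds = torch.stack([density, register], dim=-1)
    print(model.generate(seed_token=60, conditions=conds))
```

Training such a model would pair each step of the (augmented) training sequences with the rhythm-density and pitch-register values measured over a local window, so that at generation time the same two inputs can be supplied by the user instead.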