Andrei Faitas, S. Baumann, Torgrim Rudland Næss, J. Tørresen, Charles Patrick Martin
{"title":"用简单的长短期记忆网络生成令人信服的和声部分","authors":"Andrei Faitas, S. Baumann, Torgrim Rudland Næss, J. Tørresen, Charles Patrick Martin","doi":"10.5281/zenodo.3672980","DOIUrl":null,"url":null,"abstract":"Generating convincing music via deep neural networks is a challenging problem that shows promise for many applications including interactive musical creation. One part of this challenge is the problem of generating convincing accompaniment parts to a given melody, as could be used in an automatic accompaniment system. Despite much progress in this area, systems that can automatically learn to generate interesting and harmonically plausible accompaniments remain somewhat elusive. In this paper we explore systems where a user provides a sequence of notes, and a neural network model responds with an accompanying sequence of equal length. We consider two popular sequenceto-sequence models; one featuring standard unidirectional long short-term memory (LSTM) architecture, and the other featuring bidirectional LSTM. These are evaluated and compared via a qualitative study that features 106 respondents listening to eight random samples from our set of generated music, as well as two human samples. From the results we see a preference for the sequences generated by the bidirectional model as well as an indication that these sequences sound more human.","PeriodicalId":161317,"journal":{"name":"New Interfaces for Musical Expression","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Generating Convincing Harmony Parts with Simple Long Short-Term Memory Networks\",\"authors\":\"Andrei Faitas, S. Baumann, Torgrim Rudland Næss, J. Tørresen, Charles Patrick Martin\",\"doi\":\"10.5281/zenodo.3672980\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Generating convincing music via deep neural networks is a challenging problem that shows promise for many applications including interactive musical creation. One part of this challenge is the problem of generating convincing accompaniment parts to a given melody, as could be used in an automatic accompaniment system. Despite much progress in this area, systems that can automatically learn to generate interesting and harmonically plausible accompaniments remain somewhat elusive. In this paper we explore systems where a user provides a sequence of notes, and a neural network model responds with an accompanying sequence of equal length. We consider two popular sequenceto-sequence models; one featuring standard unidirectional long short-term memory (LSTM) architecture, and the other featuring bidirectional LSTM. These are evaluated and compared via a qualitative study that features 106 respondents listening to eight random samples from our set of generated music, as well as two human samples. 
From the results we see a preference for the sequences generated by the bidirectional model as well as an indication that these sequences sound more human.\",\"PeriodicalId\":161317,\"journal\":{\"name\":\"New Interfaces for Musical Expression\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-06-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"New Interfaces for Musical Expression\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.5281/zenodo.3672980\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"New Interfaces for Musical Expression","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.5281/zenodo.3672980","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Generating Convincing Harmony Parts with Simple Long Short-Term Memory Networks
Generating convincing music via deep neural networks is a challenging problem that shows promise for many applications, including interactive musical creation. One part of this challenge is generating convincing accompaniment parts for a given melody, as could be used in an automatic accompaniment system. Despite much progress in this area, systems that can automatically learn to generate interesting and harmonically plausible accompaniments remain somewhat elusive. In this paper we explore systems where a user provides a sequence of notes, and a neural network model responds with an accompanying sequence of equal length. We consider two popular sequence-to-sequence models: one featuring a standard unidirectional long short-term memory (LSTM) architecture, and the other featuring a bidirectional LSTM. These are evaluated and compared via a qualitative study in which 106 respondents listened to eight random samples from our set of generated music, as well as two human-composed samples. From the results we see a preference for the sequences generated by the bidirectional model, as well as an indication that these sequences sound more human.
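To make the bidirectional setup concrete, the sketch below shows one plausible way such a model could map a melody token sequence to a harmony sequence of equal length. It is a minimal illustration only: the vocabulary size, embedding and hidden dimensions, and the per-timestep output formulation (rather than a full encoder-decoder) are assumptions for this example, not the configuration used by the authors.

```python
# Hypothetical sketch of a bidirectional LSTM harmonizer: melody tokens in,
# harmony tokens of the same length out. Hyperparameters are illustrative.
import torch
import torch.nn as nn

class BiLSTMHarmonizer(nn.Module):
    def __init__(self, vocab_size=130, embed_dim=64, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # The bidirectional LSTM reads the melody forwards and backwards,
        # so each output step is conditioned on the full melodic context.
        self.lstm = nn.LSTM(embed_dim, hidden_dim, num_layers=2,
                            batch_first=True, bidirectional=True)
        # Project the concatenated forward/backward states to harmony tokens.
        self.out = nn.Linear(2 * hidden_dim, vocab_size)

    def forward(self, melody_tokens):
        # melody_tokens: (batch, seq_len) integer note/rest tokens
        x = self.embed(melody_tokens)
        h, _ = self.lstm(x)
        return self.out(h)  # (batch, seq_len, vocab_size) logits

model = BiLSTMHarmonizer()
melody = torch.randint(0, 130, (1, 32))   # one 32-step melody
harmony = model(melody).argmax(dim=-1)    # greedy harmony line, same length
print(harmony.shape)                      # torch.Size([1, 32])
```

A unidirectional variant would simply set bidirectional=False (and use a Linear(hidden_dim, vocab_size) output layer), so each harmony step would only see the melody up to the current position, which is the key architectural difference the study compares.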