Generating Musical Expression of MIDI Music with LSTM Neural Network
Maria Klara Jedrzejewska, Adrian Zjawinski, Bartlomiej Stasiak
2018 11th International Conference on Human System Interaction (HSI), July 2018
DOI: 10.1109/HSI.2018.8431033
Citations: 6
Abstract
Musicians aim to express emotions through musical performances. Technically, musical expression is created mainly by variations in tempo and dynamics. The purpose of this paper is to investigate the possibility of generating dynamics and expressive tempo for plain (inexpressive) MIDI files by means of a long short-term memory (LSTM) artificial neural network. Two neural network models (one for dynamics, one for tempo) were built using the Keras deep learning library and trained on a dataset consisting of Chopin's mazurkas. The trained models are capable of generating an expressive performance of an inexpressive mazurka represented in MIDI format. The generated performances are evaluated by comparing the resulting dynamics and tempo graphs to human performances, and by a survey testing how easily listeners can distinguish human from generated performances. The conclusion of the research is that expression generated with an LSTM network can be very similar to human expression and convincing for listeners.
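The abstract does not give architectural details, but the setup it describes (an LSTM regressing an expressive value per note, built in Keras, with one model for dynamics and one for tempo) could look roughly like the sketch below. The window length, feature set, hidden size, and training data here are illustrative assumptions, not the authors' configuration.

```python
# Minimal sketch of one of the two sequence models (hypothetical
# hyperparameters; the paper's exact architecture is not stated
# in the abstract).
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

SEQ_LEN = 64     # assumed: consecutive note events per training window
N_FEATURES = 4   # assumed: e.g. pitch, duration, metric position, interval

model = keras.Sequential([
    layers.Input(shape=(SEQ_LEN, N_FEATURES)),
    layers.LSTM(128, return_sequences=True),    # assumed hidden size
    layers.TimeDistributed(layers.Dense(1)),    # one expressive value
                                                # (velocity or tempo) per note
])
model.compile(optimizer="adam", loss="mse")

# Placeholder data standing in for (score features -> expressive target)
# pairs extracted from recorded mazurka performances; a second model of
# the same shape would be trained for the other target.
X = np.random.rand(32, SEQ_LEN, N_FEATURES).astype("float32")
y = np.random.rand(32, SEQ_LEN, 1).astype("float32")
model.fit(X, y, epochs=2, batch_size=8, verbose=0)
```

At generation time, the same score features would be extracted from an inexpressive MIDI file and the predicted per-note values written back as note velocities and tempo changes.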