{"title":"Improve symbolic music pre-training model using MusicTransformer structure","authors":"Yingfeng Fu, Y. Tanimura, H. Nakada","doi":"10.1109/IMCOM56909.2023.10035617","DOIUrl":null,"url":null,"abstract":"Pre-training driven by vast data has shown great power in natural language understanding. The idea has also been applied to symbolic music. However, many existing works using pre-training for symbolic music are not general enough to tackle all the tasks in musical information retrieval, and there is still space to improve the model structure. To make up for the insufficiency and compare it with the existing works, we employed a BERT-like masked language pre-training approach to train a stacked MusicTransformer on MAESTRO dataset. Then we fine-tuned our pre-trained model on several symbolic music understanding tasks. In the work, our contribution is 1)we improved MusicBERT by modifying the model structure. 2)be-sides the existing evaluation downstream tasks, we complemented several downstream tasks, including melody extraction, emotion classification, and composer classification. We pre-trained the modified model and existing works under the same condition. We make a comparison of our pre-trained model with the previous works. The result shows that the modified model is more powerful than the previous models with the same pre-training setting.","PeriodicalId":230213,"journal":{"name":"2023 17th International Conference on Ubiquitous Information Management and Communication (IMCOM)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 17th International Conference on Ubiquitous Information Management and Communication (IMCOM)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IMCOM56909.2023.10035617","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Pre-training driven by vast amounts of data has shown great power in natural language understanding, and the idea has also been applied to symbolic music. However, many existing works that use pre-training for symbolic music are not general enough to tackle all the tasks in music information retrieval, and there is still room to improve the model structure. To address this insufficiency and compare against existing works, we employed a BERT-like masked language pre-training approach to train a stacked MusicTransformer on the MAESTRO dataset. We then fine-tuned the pre-trained model on several symbolic music understanding tasks. Our contributions are: 1) we improved MusicBERT by modifying the model structure; 2) besides the existing evaluation downstream tasks, we supplemented several additional downstream tasks, including melody extraction, emotion classification, and composer classification. We pre-trained the modified model and existing works under the same conditions and compared our pre-trained model with the previous works. The results show that the modified model outperforms the previous models under the same pre-training setting.
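To make the described approach concrete, the sketch below illustrates BERT-style masked-token pre-training over symbolic music event tokens. It is not the authors' implementation: it uses a plain PyTorch TransformerEncoder as a stand-in for the stacked MusicTransformer, and the vocabulary size, sequence length, masking ratio, and model dimensions are illustrative assumptions rather than values from the paper.

```python
# Minimal sketch of BERT-style masked-token pre-training on symbolic music
# event tokens. The encoder is a generic nn.TransformerEncoder used as a
# stand-in for the paper's stacked MusicTransformer; all hyperparameters
# below are assumed for illustration only.
import torch
import torch.nn as nn

VOCAB_SIZE = 512      # assumed size of the MIDI event vocabulary
MAX_LEN = 256         # assumed maximum token-sequence length
MASK_ID = VOCAB_SIZE  # extra token id reserved for [MASK]
MASK_PROB = 0.15      # BERT-style masking ratio

class MaskedMusicModel(nn.Module):
    def __init__(self, d_model=256, n_heads=4, n_layers=4):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE + 1, d_model)   # +1 for [MASK]
        self.pos = nn.Embedding(MAX_LEN, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, VOCAB_SIZE)           # predict original tokens

    def forward(self, tokens):
        positions = torch.arange(tokens.size(1), device=tokens.device)
        x = self.embed(tokens) + self.pos(positions)
        return self.head(self.encoder(x))

def mask_tokens(tokens):
    """Replace a random fraction of tokens with [MASK]; targets keep the originals."""
    mask = torch.rand(tokens.shape) < MASK_PROB
    corrupted = tokens.masked_fill(mask, MASK_ID)
    targets = tokens.masked_fill(~mask, -100)   # ignore unmasked positions in the loss
    return corrupted, targets

# One illustrative pre-training step on a random batch standing in for tokenized MIDI.
model = MaskedMusicModel()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss(ignore_index=-100)

batch = torch.randint(0, VOCAB_SIZE, (8, MAX_LEN))
corrupted, targets = mask_tokens(batch)
logits = model(corrupted)
loss = loss_fn(logits.reshape(-1, VOCAB_SIZE), targets.reshape(-1))
loss.backward()
optimizer.step()
```

After pre-training in this fashion, fine-tuning for the downstream tasks mentioned in the abstract (melody extraction, emotion classification, composer classification) would typically replace the token-prediction head with a task-specific classification head while reusing the pre-trained encoder weights.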