基于自回归多态音符模型的钢琴复调转写

Taegyun Kwon, Dasaem Jeong, Juhan Nam
{"title":"基于自回归多态音符模型的钢琴复调转写","authors":"Taegyun Kwon, Dasaem Jeong, Juhan Nam","doi":"10.5281/ZENODO.4245466","DOIUrl":null,"url":null,"abstract":"Recent advances in polyphonic piano transcription have been made primarily by a deliberate design of neural network architectures that detect different note states such as onset or sustain and model the temporal evolution of the states. The majority of them, however, use separate neural networks for each note state, thereby optimizing multiple loss functions, and also they handle the temporal evolution of note states by abstract connections between the state-wise neural networks or using a post-processing module. In this paper, we propose a unified neural network architecture where multiple note states are predicted as a softmax output with a single loss function and the temporal order is learned by an auto-regressive connection within the single neural network. This compact model allows to increase note states without architectural complexity. Using the MAESTRO dataset, we examine various combinations of multiple note states including on, onset, sustain, re-onset, offset, and off. We also show that the autoregressive module effectively learns inter-state dependency of notes. Finally, we show that our proposed model achieves performance comparable to state-of-the-arts with fewer parameters.","PeriodicalId":309903,"journal":{"name":"International Society for Music Information Retrieval Conference","volume":"42 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"12","resultStr":"{\"title\":\"Polyphonic Piano Transcription Using Autoregressive Multi-State Note Model\",\"authors\":\"Taegyun Kwon, Dasaem Jeong, Juhan Nam\",\"doi\":\"10.5281/ZENODO.4245466\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Recent advances in polyphonic piano transcription have been made primarily by a deliberate design of neural network architectures that detect different note states such as onset or sustain and model the temporal evolution of the states. The majority of them, however, use separate neural networks for each note state, thereby optimizing multiple loss functions, and also they handle the temporal evolution of note states by abstract connections between the state-wise neural networks or using a post-processing module. In this paper, we propose a unified neural network architecture where multiple note states are predicted as a softmax output with a single loss function and the temporal order is learned by an auto-regressive connection within the single neural network. This compact model allows to increase note states without architectural complexity. Using the MAESTRO dataset, we examine various combinations of multiple note states including on, onset, sustain, re-onset, offset, and off. We also show that the autoregressive module effectively learns inter-state dependency of notes. Finally, we show that our proposed model achieves performance comparable to state-of-the-arts with fewer parameters.\",\"PeriodicalId\":309903,\"journal\":{\"name\":\"International Society for Music Information Retrieval Conference\",\"volume\":\"42 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-10-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"12\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Society for Music Information Retrieval Conference\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.5281/ZENODO.4245466\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Society for Music Information Retrieval Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.5281/ZENODO.4245466","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 12

摘要

最近在复调钢琴转录方面取得的进展主要是通过故意设计神经网络架构来检测不同的音符状态,如开始或持续,并对状态的时间演变进行建模。然而,它们中的大多数对每个音符状态使用单独的神经网络,从而优化多个损失函数,并且它们还通过状态神经网络之间的抽象连接或使用后处理模块来处理音符状态的时间演变。在本文中,我们提出了一种统一的神经网络架构,其中多个音符状态被预测为具有单个损失函数的softmax输出,并且通过单个神经网络内的自回归连接学习时间顺序。这种紧凑的模型允许在不增加架构复杂性的情况下增加注释状态。使用MAESTRO数据集,我们检查多个音符状态的各种组合,包括开、开始、维持、重新开始、偏移和关闭。我们还证明了自回归模块有效地学习了音符的状态间依赖。最后,我们证明了我们提出的模型在参数较少的情况下实现了与最先进的性能相当的性能。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Polyphonic Piano Transcription Using Autoregressive Multi-State Note Model
Recent advances in polyphonic piano transcription have been made primarily by a deliberate design of neural network architectures that detect different note states such as onset or sustain and model the temporal evolution of the states. The majority of them, however, use separate neural networks for each note state, thereby optimizing multiple loss functions, and also they handle the temporal evolution of note states by abstract connections between the state-wise neural networks or using a post-processing module. In this paper, we propose a unified neural network architecture where multiple note states are predicted as a softmax output with a single loss function and the temporal order is learned by an auto-regressive connection within the single neural network. This compact model allows to increase note states without architectural complexity. Using the MAESTRO dataset, we examine various combinations of multiple note states including on, onset, sustain, re-onset, offset, and off. We also show that the autoregressive module effectively learns inter-state dependency of notes. Finally, we show that our proposed model achieves performance comparable to state-of-the-arts with fewer parameters.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信