Hybrid Learning Module-Based Transformer for Multitrack Music Generation With Music Theory

IF 4.5 | Zone 2, Computer Science | Q1, COMPUTER SCIENCE, CYBERNETICS
Yun Tie;Xin Guo;Donghui Zhang;Jiessie Tie;Lin Qi;Yuhang Lu
DOI: 10.1109/TCSS.2024.3486604
Journal: IEEE Transactions on Computational Social Systems, vol. 12, no. 2, pp. 862-872
Published: 2024-11-19
Publication type: Journal Article
Open access: no
URL: https://ieeexplore.ieee.org/document/10758307/
Citations: 0

Abstract

In recent years, multitrack music generation has attracted considerable attention in both academia and industry because it combines multiple instruments in a single arrangement. The main challenge is to keep each track internally coherent while ensuring that the tracks work well together. To address this, the article introduces a hybrid learning encoder architecture. Each track's encoder is an independent transformer that applies self-attention within the track and inter-attention across tracks. The resulting features are concatenated and passed to the decoder. Previous multitrack generation work has mostly operated in unconditional settings, producing music of limited practical value because it does not comply with established music theory principles. The article therefore proposes multitrack music generation guided by music theory rules. Using reinforcement learning, the decoder-generated music serves as the initial state: positive feedback is given when the generated music follows music theory rules, and negative feedback is applied otherwise, pushing the multitrack output toward widely accepted music theory principles. Experiments are conducted on the publicly available LMD dataset and the self-constructed MUT dataset, and the authors report that the results support the effectiveness of the proposed method.
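The two mechanisms described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give layer sizes, the concatenation axis, the reinforcement learning algorithm, or the list of music theory rules. The names `attend`, `hybrid_encode`, and `theory_reward` are hypothetical, identity projections stand in for learned attention weights, feature-wise concatenation is an assumption, and the "stay in C major" rule is only a placeholder for whatever rules the paper actually uses.

```python
import math

def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(queries, memory):
    """Scaled dot-product attention; identity projections replace learned weights."""
    dim = len(queries[0])
    output = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, m)) / math.sqrt(dim) for m in memory]
        weights = softmax(scores)
        output.append([sum(w * m[j] for w, m in zip(weights, memory)) for j in range(dim)])
    return output

def hybrid_encode(tracks):
    """Per track: self-attention over its own steps, inter-attention over the other tracks.

    Assumes at least two tracks, so every track has something to attend to.
    """
    encoded = {}
    for name, steps in tracks.items():
        intra = attend(steps, steps)
        others = [vec for other, seq in tracks.items() if other != name for vec in seq]
        inter = attend(steps, others)
        # Assumption: the two views are concatenated feature-wise at each time step.
        encoded[name] = [a + b for a, b in zip(intra, inter)]
    return encoded

C_MAJOR = {0, 2, 4, 5, 7, 9, 11}  # pitch classes of the C major scale

def theory_reward(pitches, scale=C_MAJOR):
    """Toy rule: +1 for each in-scale MIDI pitch, -1 for each out-of-scale pitch."""
    return sum(1 if p % 12 in scale else -1 for p in pitches)

# Two toy tracks, three time steps each, two features per step.
tracks = {
    "piano": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    "bass": [[0.5, 0.5], [1.0, 0.0], [0.0, 0.0]],
}
encoded = hybrid_encode(tracks)
# Assumption: per-track features are joined along the sequence axis for the decoder.
decoder_memory = [vec for name in tracks for vec in encoded[name]]
```

In a full system the reward would be fed to a policy-gradient or similar update that fine-tunes the decoder; the sketch stops at computing the scalar feedback because the abstract does not say which update rule is used.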
Source journal
IEEE Transactions on Computational Social Systems (Social Sciences: Social Sciences (miscellaneous))
CiteScore: 10.00
Self-citation rate: 20.00%
Articles published: 316
Journal description: IEEE Transactions on Computational Social Systems focuses on the modeling, simulation, analysis, and understanding of social systems from a quantitative and/or computational perspective. "Systems" include man-man, man-machine, and machine-machine organizations and adversarial situations, as well as social media structures and their dynamics. More specifically, the Transactions publishes articles on modeling the dynamics of social systems, methodologies for incorporating and representing socio-cultural and behavioral aspects in computational modeling, analysis of social system behavior and structure, and paradigms for social systems modeling and simulation. The journal also features articles on social network dynamics, social intelligence and cognition, social systems design and architectures, socio-cultural modeling and representation, and computational behavior modeling, and their applications.