Yun Tie;Xin Guo;Donghui Zhang;Jiessie Tie;Lin Qi;Yuhang Lu
Title: Hybrid Learning Module-Based Transformer for Multitrack Music Generation With Music Theory
Journal: IEEE Transactions on Computational Social Systems, vol. 12, no. 2, pp. 862-872
Published: 2024-11-19
DOI: 10.1109/TCSS.2024.3486604
URL: https://ieeexplore.ieee.org/document/10758307/
Citation count: 0
Abstract
Multitrack music generation has attracted growing attention in academia and industry because it combines several instruments playing together. The main challenge is to keep each track internally coherent while making the tracks work well together. To address this, this article introduces a hybrid learning encoder architecture. Each track's encoder is an independent transformer that applies self-attention within its own track and inter-attention across different tracks. The resulting features are concatenated and passed to the decoder. Previous multitrack music generation work has mostly been unconditional, producing music of limited practical value because it does not comply with established music theory principles. To overcome this limitation, the article proposes multitrack music generation guided by music theory rules through reinforcement learning. The decoder-generated music serves as the initial state; positive feedback is given when the generated music adheres to the rules, and negative feedback is applied otherwise, pushing the multitrack output toward widely accepted music theory principles. Experiments on the publicly available LMD dataset and the self-constructed MUT dataset support the effectiveness of the proposed method.
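The rule-based feedback described in the abstract can be illustrated with a minimal sketch. The abstract does not list the actual music theory rules or the reward values, so everything here is an assumption for illustration only: the single rule (notes should belong to the C major scale), the +1/-1 scoring, and the names `C_MAJOR_PITCH_CLASSES`, `track_reward`, and `multitrack_reward` are hypothetical and are not taken from the paper.

```python
from typing import Dict, FrozenSet, List

# Hypothetical rule: pitch classes of the C major scale. This is an assumed
# stand-in; the abstract does not specify the paper's actual rules.
C_MAJOR_PITCH_CLASSES: FrozenSet[int] = frozenset({0, 2, 4, 5, 7, 9, 11})


def track_reward(pitches: List[int],
                 allowed: FrozenSet[int] = C_MAJOR_PITCH_CLASSES) -> int:
    """Return +1 for each in-scale MIDI pitch and -1 for each out-of-scale pitch."""
    return sum(1 if pitch % 12 in allowed else -1 for pitch in pitches)


def multitrack_reward(tracks: Dict[str, List[int]]) -> int:
    """Sum the per-track rewards over all tracks of a generated piece."""
    return sum(track_reward(pitches) for pitches in tracks.values())


# A toy two-track piece given as MIDI note numbers.
piece = {
    "piano": [60, 64, 67, 72],  # C, E, G, C: all in C major
    "bass": [36, 43, 37],       # C, G, C sharp: the last note is out of scale
}
print(multitrack_reward(piece))  # 4 from piano, 1 from bass -> 5
```

In a reinforcement learning setup of the kind the abstract outlines, a scalar like this would be the feedback signal for a generated piece; a realistic reward would presumably combine many rules (harmony between tracks, rhythm, voice leading) rather than a single scale check.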
About the journal:
IEEE Transactions on Computational Social Systems focuses on the modeling, simulation, analysis, and understanding of social systems from a quantitative and/or computational perspective. "Systems" include man-man, man-machine, and machine-machine organizations and adversarial situations, as well as social media structures and their dynamics. More specifically, the journal publishes articles on modeling the dynamics of social systems, methodologies for incorporating and representing socio-cultural and behavioral aspects in computational modeling, analysis of social system behavior and structure, and paradigms for social systems modeling and simulation. It also features articles on social network dynamics, social intelligence and cognition, social systems design and architectures, socio-cultural modeling and representation, computational behavior modeling, and their applications.