Improving Phrase Segmentation in Symbolic Folk Music: A Hybrid Model with Local Context and Global Structure Awareness.

IF 2.1 | Zone 3: Physics and Astrophysics | JCR Q2: PHYSICS, MULTIDISCIPLINARY

Entropy | Pub Date: 2025-04-24 | DOI: 10.3390/e27050460

Xin Guan, Zhilin Dong, Hui Liu, Qiang Li


Citations: 0

Abstract


Improving Phrase Segmentation in Symbolic Folk Music: A Hybrid Model with Local Context and Global Structure Awareness.

The segmentation of symbolic music phrases is crucial for music information retrieval and structural analysis. However, existing BiLSTM-CRF methods mainly rely on local semantics, making it difficult to capture long-range dependencies, leading to inaccurate phrase boundary recognition across measures or themes. Traditional Transformer models use static embeddings, limiting their adaptability to different musical styles, structures, and melodic evolutions. Moreover, multi-head self-attention struggles with local context modeling, causing the loss of short-term information (e.g., pitch variation, melodic integrity, and rhythm stability), which may result in over-segmentation or merging errors. To address these issues, we propose a segmentation method integrating local context enhancement and global structure awareness. This method overcomes traditional models' limitations in long-range dependency modeling, improves phrase boundary recognition, and adapts to diverse musical styles and melodies. Specifically, dynamic note embeddings enhance contextual awareness across segments, while an improved attention mechanism strengthens both global semantics and local context modeling. Combining these strategies ensures reasonable phrase boundaries and prevents unnecessary segmentation or merging. The experimental results show that our method outperforms the state-of-the-art methods for symbolic music phrase segmentation, with phrase boundaries better aligned to musical structures.
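The two failure modes named in the abstract, over-segmentation and merging, can be made concrete by comparing predicted phrase boundaries against reference boundaries. The following is a minimal sketch and not the authors' evaluation protocol: the function name `boundary_errors`, the exact-match criterion, and the 16-note example melody are all assumptions introduced here for illustration.

```python
def boundary_errors(reference, predicted):
    """Compare two collections of phrase-boundary positions (note indices).

    Assumption: a predicted boundary counts as correct only if it matches a
    reference boundary exactly. The paper may use a different criterion.
    """
    ref, pred = set(reference), set(predicted)
    hits = ref & pred
    spurious = sorted(pred - ref)  # extra cuts: over-segmentation
    missed = sorted(ref - pred)    # absent cuts: adjacent phrases merged

    precision = len(hits) / len(pred) if pred else 0.0
    recall = len(hits) / len(ref) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0

    return {
        "spurious": spurious,
        "missed": missed,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical 16-note melody whose reference phrases end after notes 3, 7, 11, 15.
reference = [3, 7, 11, 15]
# Hypothetical prediction: one extra cut at note 5, and the cut at note 11 is missing.
predicted = [3, 5, 7, 15]

result = boundary_errors(reference, predicted)
print(result)
```

In this toy case the extra boundary at note 5 splits one phrase in two (over-segmentation), while the missing boundary at note 11 fuses the third and fourth phrases (a merging error). Both error types lower the boundary-level F1 score, which is why a segmentation model has to balance them.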


Source journal

Entropy (PHYSICS, MULTIDISCIPLINARY)

CiteScore: 4.90
Self-citation rate: 11.10%
Articles published: 1580
Review time: 21.05 days

About the journal: Entropy (ISSN 1099-4300), an international and interdisciplinary journal of entropy and information studies, publishes reviews, regular research papers and short notes. Our aim is to encourage scientists to publish their theoretical and experimental details as fully as possible. There is no restriction on the length of the papers. If a paper involves computations or experiments, the details must be provided so that the results can be reproduced.