Multimodal structure segmentation and analysis of music using audio and textual information

2009 IEEE International Symposium on Circuits and Systems, Pub Date: 2009-05-24, DOI: 10.1109/ISCAS.2009.5118096

Heng Tze Cheng, Yi-Hsuan Yang, Yu-Ching Lin, Homer H. Chen

Citations: 22

Abstract

In this paper, we present a multimodal approach to structure segmentation of music with applications to audio content analysis and music information retrieval. In particular, since lyrics contain rich information about the semantic structure of a song, our approach incorporates lyrics to overcome the existing difficulties associated with large acoustic variation in music. We further design a constrained clustering algorithm for music segmentation and evaluate its performance on commercial recordings. Experimental results show that our method can effectively detect the boundaries and the types of semantic structure of music segments.
