{"title":"Harmonic Blind Sound Source Isolation Enhanced by Spectrum Clustering","authors":"Cynthia Xin Zhang, Wenxin Jiang, Z. Ras","doi":"10.1109/ICDMW.2008.67","DOIUrl":null,"url":null,"abstract":"Automatic indexing of music by instruments and their types is a challenging problem, especially when multiple instruments are playing at the same time. We have built a database containing more than one million of music instrument sounds, each described by a large number o features including standard MPEG7 audio descriptors, features for speech recognition, and many new audio features developed by our team. Our previous research results show that all these features only lead to classifiers which successfully identify music instruments in monophonic music (only one instrument playing at a time). Their confidence for polyphonic music is much lower. This brought the need for blind sound source separation algorithms. In this paper, we present a new spectrum clustering enhanced method which improves the estimation of fundamental frequency as well as the balance of the categorization tree of training datasets, and therefore enhances the precision of automatic indexing. The system is recursively detecting the pitch of the predominant sound source, then calculates the features based on the estimated pitch, and then predicts the most similar spectrum by the corresponding classification tree, and finally subtracts the estimated predominant spectrum until silence is detected.","PeriodicalId":175955,"journal":{"name":"2008 IEEE International Conference on Data Mining Workshops","volume":"56 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2008-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2008 IEEE International Conference on Data Mining Workshops","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDMW.2008.67","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Automatic indexing of music by instruments and their types is a challenging problem, especially when multiple instruments are playing at the same time. We have built a database containing more than one million musical instrument sounds, each described by a large number of features, including standard MPEG-7 audio descriptors, features used in speech recognition, and many new audio features developed by our team. Our previous research shows that these features lead to classifiers that successfully identify musical instruments only in monophonic music (a single instrument playing at a time); their confidence for polyphonic music is much lower. This motivates the need for blind sound source separation algorithms. In this paper, we present a new spectrum-clustering-enhanced method that improves both the estimation of the fundamental frequency and the balance of the categorization tree built from the training datasets, and therefore improves the precision of automatic indexing. The system recursively detects the pitch of the predominant sound source, calculates features based on the estimated pitch, predicts the most similar spectrum with the corresponding classification tree, and then subtracts the estimated predominant spectrum, repeating until silence is detected.
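To make the iterative detect-and-subtract loop described above concrete, here is a minimal NumPy sketch of that structure. It is not the authors' method: the spectrum-clustering enhancement and the classification-tree spectrum matching are not reproduced, the pitch estimator is a simple harmonic-summation stand-in, and all function names, partial counts, and thresholds (`estimate_predominant_pitch`, `subtract_harmonics`, `iterative_separation`, `silence_thresh`) are hypothetical choices for illustration only.

```python
import numpy as np

def estimate_predominant_pitch(spectrum, sr, n_fft, fmin=50.0, fmax=2000.0):
    """Estimate the fundamental of the predominant source by harmonic
    summation over candidate pitches (illustrative stand-in; the paper's
    clustering-enhanced estimator is not reproduced here)."""
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    candidates = np.linspace(fmin, fmax, 400)
    scores = []
    for f0 in candidates:
        harmonics = f0 * np.arange(1, 11)          # first 10 partials
        bins = np.searchsorted(freqs, harmonics)
        bins = bins[bins < len(spectrum)]
        scores.append(spectrum[bins].sum())
    return candidates[int(np.argmax(scores))]

def subtract_harmonics(spectrum, sr, n_fft, f0, width_hz=30.0):
    """Zero out narrow bands around the estimated partials of f0, as a crude
    approximation of subtracting the predominant harmonic spectrum."""
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    residual = spectrum.copy()
    for h in f0 * np.arange(1, 11):
        residual[np.abs(freqs - h) < width_hz / 2] = 0.0
    return residual

def iterative_separation(frame, sr, n_fft=4096, silence_thresh=1e-3, max_sources=5):
    """Loop: estimate the predominant pitch (where the paper would also compute
    features and query a classification tree), subtract its harmonic spectrum,
    and repeat until the residual is near silence."""
    spectrum = np.abs(np.fft.rfft(frame, n=n_fft))
    pitches = []
    for _ in range(max_sources):
        if spectrum.sum() < silence_thresh * n_fft:
            break                                   # residual treated as silence
        f0 = estimate_predominant_pitch(spectrum, sr, n_fft)
        pitches.append(f0)
        spectrum = subtract_harmonics(spectrum, sr, n_fft, f0)
    return pitches
```

In this sketch the per-iteration classification step is omitted; in the paper, the estimated pitch drives feature extraction and a tree-based prediction of the most similar training spectrum, which is then subtracted instead of the simple harmonic mask used here.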