Analysis of an MFCC-based audio indexing system for efficient coding of multimedia sources

O. Mubarak, E. Ambikairajah, J. Epps
Proceedings of the Eighth International Symposium on Signal Processing and Its Applications, 2005
Published: 2005-08-28
DOI: 10.1109/ISSPA.2005.1581014 (https://doi.org/10.1109/ISSPA.2005.1581014)
Citations: 35

Abstract

Discrimination between speech and music signals is an important problem in efficient digital radio broadcasting, particularly for variable bit rate applications such as Internet radio. This paper presents a speech/music discrimination system based on a Mel frequency cepstral coefficient (MFCC) front end and a Gaussian mixture model (GMM) classifier. The system can be used to select the optimum coding scheme for the current frame of an input signal without knowing a priori whether the frame has speech-like or music-like characteristics. An analysis of speech and music error rates for different numbers of MFCCs (from 8 to 28) is presented. For the 46-minute evaluation database used in this experiment, an accuracy of up to 97.14% for music and 93.87% for speech can be attained.
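
The decision step described in the abstract, choosing between a speech model and a music model for incoming feature frames, can be illustrated with a minimal sketch. It assumes one diagonal-covariance GMM per class and a maximum-likelihood decision; the abstract does not state the covariance type, the number of mixture components, or the frame length, so those choices are assumptions. The function names and the toy 2-dimensional model parameters are hypothetical, hand-picked rather than trained, and stand in for models that would be trained on real MFCC vectors (8 to 28 coefficients per frame in the paper's analysis).

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x, gmm):
    """Log-likelihood of x under a GMM given as (weight, mean, var) triples."""
    terms = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)  # log-sum-exp trick for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def classify_frames(frames, speech_gmm, music_gmm):
    """Label a run of feature frames by the larger total log-likelihood.

    With a single frame this reduces to a per-frame decision.
    """
    speech_score = sum(gmm_log_likelihood(f, speech_gmm) for f in frames)
    music_score = sum(gmm_log_likelihood(f, music_gmm) for f in frames)
    return "speech" if speech_score > music_score else "music"


# Hypothetical toy models: each entry is (weight, mean vector, variance vector).
# These values are illustrative only and are not taken from the paper.
SPEECH_GMM = [(0.6, [0.0, 0.0], [1.0, 1.0]), (0.4, [1.0, -1.0], [0.5, 0.5])]
MUSIC_GMM = [(0.5, [4.0, 4.0], [1.0, 1.0]), (0.5, [5.0, 3.0], [0.5, 0.5])]

label_a = classify_frames([[0.1, -0.2], [0.8, -0.9]], SPEECH_GMM, MUSIC_GMM)
label_b = classify_frames([[4.2, 3.9], [5.1, 3.2]], SPEECH_GMM, MUSIC_GMM)
```

In a coder-selection setting, the returned label would then choose between a speech-oriented and a music-oriented coding scheme for the corresponding frames. Summing log-likelihoods over several frames is one possible way to smooth the decision and is not confirmed by the abstract.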