Comparing MFCC and MPEG-7 audio features for feature extraction, maximum likelihood HMM and entropic prior HMM for sports audio classification

2003 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '03), Proceedings. Publication date: 2003-07-06. DOI: 10.1109/ICASSP.2003.1200048

Ziyou Xiong, R. Radhakrishnan, Ajay Divakaran, Thomas S. Huang

Citations: 35

Abstract

We present a comparison of six methods for classifying sports audio. For feature extraction, we have two choices: MPEG-7 audio features and Mel-scale frequency cepstral coefficients (MFCC). For classification, we also have two choices: maximum likelihood hidden Markov models (ML-HMM) and entropic prior HMMs (EP-HMM). EP-HMMs, in turn, have two variations, with and without trimming of the model parameters, which gives three classifiers in all. Pairing each of the two feature sets with each of the three classifiers yields six possible methods. Our results show that all six combinations achieve classification accuracy of around 90%, with the best and the second best being, respectively, MPEG-7 features with EP-HMM and MFCC with ML-HMM.
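The count of six in the abstract follows from crossing the two feature sets with the three classifier variants. The short sketch below only enumerates those pairings; the names `FEATURES`, `CLASSIFIERS`, and `enumerate_methods` are illustrative labels chosen here, not identifiers from the paper, and no feature extraction or HMM training is performed.

```python
from itertools import product

# Labels taken from the abstract; the variable names are hypothetical.
FEATURES = ["MPEG-7", "MFCC"]
CLASSIFIERS = [
    "ML-HMM",
    "EP-HMM (with trimming)",
    "EP-HMM (without trimming)",
]


def enumerate_methods():
    """Return every (feature set, classifier) pairing as a list of tuples."""
    return list(product(FEATURES, CLASSIFIERS))


if __name__ == "__main__":
    # Print the pairings as a numbered list, one method per line.
    for index, (feature, classifier) in enumerate(enumerate_methods(), start=1):
        print(f"{index}. {feature} features + {classifier}")
```

Each tuple stands for one of the compared methods, so an evaluation loop could iterate over this list and record one accuracy figure per pairing.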
