An N-gram model for unstructured audio signals toward information retrieval

2010 IEEE International Workshop on Multimedia Signal Processing. Pub Date: 2010-10-01. DOI: 10.1109/MMSP.2010.5662068

Samuel Kim, Shiva Sundaram, P. Georgiou, Shrikanth S. Narayanan

Citations: 10

Abstract

An N-gram modeling approach for unstructured audio signals is introduced, with applications to audio information retrieval. The proposed N-gram approach aims to capture local dynamic information in acoustic words within the acoustic topic model framework, which assumes that an audio signal consists of latent acoustic topics and that each topic can be interpreted as a distribution over acoustic words. Experimental results on classifying audio clips from the BBC Sound Effects Library according to both semantic and onomatopoeic labels indicate that the proposed N-gram approach outperforms a bag-of-words-only approach because it provides complementary local dynamic information.
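The contrast between a bag-of-words representation and an N-gram representation can be illustrated with a minimal sketch. It assumes each audio clip has already been quantized into a sequence of acoustic-word indices. The clips, the index values, and the helper `ngram_counts` are hypothetical and are not taken from the paper, and the topic-model stage is not shown. The sketch only shows that two clips with identical word counts can still differ in their bigram counts, which is the kind of local ordering information the abstract refers to.

```python
from collections import Counter


def ngram_counts(words, n):
    """Count contiguous n-grams in a sequence of acoustic-word indices."""
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


# Two hypothetical clips, already quantized to acoustic-word indices.
# The values are illustrative only.
clip_a = [3, 7, 3, 7, 1]
clip_b = [7, 7, 3, 1, 3]

# Bag-of-words view (n = 1): order is discarded.
unigrams_a = ngram_counts(clip_a, 1)
unigrams_b = ngram_counts(clip_b, 1)

# Bigram view (n = 2): adjacent pairs keep local ordering.
bigrams_a = ngram_counts(clip_a, 2)
bigrams_b = ngram_counts(clip_b, 2)

print("same unigram counts:", unigrams_a == unigrams_b)
print("same bigram counts:", bigrams_a == bigrams_b)
```

In this toy case the unigram counts match while the bigram counts do not, so a classifier fed both representations has more to work with than one fed word counts alone.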



