Nonexclusive audio segmentation and indexing as a pre-processor for audio information mining

2013 6th International Congress on Image and Signal Processing (CISP). Pub Date: 2013-12-01. DOI: 10.1109/CISP.2013.6743930

Francis F. Li

Citations: 4

Abstract

Much content-related information can be extracted from recorded soundtracks, such as those of multimedia files. Soundtracks can be heuristically classified into three categories: speech, music, and ambient or event sounds. Past research focused on algorithms that classify audio clips exclusively, assigning each clip to a single category. However, soundtracks in media content often consist of overlapping mixtures of all three types of sound. Nonexclusive segmentation and indexing are therefore essential pre-processing steps for effective audio information mining and metadata generation. This paper emphasizes the importance of nonexclusive indexing and segmentation methods, identifies the challenges, and proposes a universal architecture for nonexclusive segmentation and indexing as a pre-processor for audio information mining, metadata extraction and scene analysis. Related feature selection, pattern recognition and signal processing algorithms are presented, and testing results are discussed.
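To make the distinction between exclusive and nonexclusive indexing concrete, the following is a minimal sketch and not the architecture proposed in the paper. It assumes some upstream detector has already produced an independent score per frame for each of the three categories; the function name `nonexclusive_index`, the 0.5 threshold, the one-second frame length and the example scores are all hypothetical. Each category is thresholded on its own, so the resulting time intervals for speech, music and event sounds are free to overlap.

```python
from typing import Dict, List, Tuple

# The three heuristic categories named in the abstract.
CLASSES = ("speech", "music", "event")


def nonexclusive_index(
    frame_scores: List[Dict[str, float]],
    frame_sec: float = 1.0,   # assumed frame length in seconds
    threshold: float = 0.5,   # assumed per-class decision threshold
) -> Dict[str, List[Tuple[float, float]]]:
    """Return, per class, the (start, end) intervals in which it is active.

    Every class is decided independently of the others, so intervals
    belonging to different classes may overlap in time.
    """
    index: Dict[str, List[Tuple[float, float]]] = {name: [] for name in CLASSES}
    for name in CLASSES:
        start = None  # index of the first frame of the current active run
        for i, scores in enumerate(frame_scores):
            active = scores.get(name, 0.0) >= threshold
            if active and start is None:
                start = i
            elif not active and start is not None:
                index[name].append((start * frame_sec, i * frame_sec))
                start = None
        if start is not None:  # close a run that lasts until the final frame
            index[name].append((start * frame_sec, len(frame_scores) * frame_sec))
    return index


# Hypothetical detector scores for four consecutive one-second frames.
frames = [
    {"speech": 0.9, "music": 0.1, "event": 0.0},
    {"speech": 0.8, "music": 0.7, "event": 0.1},
    {"speech": 0.2, "music": 0.9, "event": 0.6},
    {"speech": 0.1, "music": 0.8, "event": 0.0},
]
index = nonexclusive_index(frames)
```

In this toy input, speech and music are both active during the second frame, and music and an event sound are both active during the third. An exclusive classifier would have to pick a single label for each of those frames, whereas the index above records every category that is present.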


