Query by example search with segmented dynamic time warping for non-exact spoken queries

2015 23rd European Signal Processing Conference (EUSIPCO) | Pub Date: 2015-12-28 | DOI: 10.1109/EUSIPCO.2015.7362666

Jorge Proença, A. Veiga, F. Perdigão

Citations: 19

Abstract

This paper presents an approach to the Query-by-Example task of finding spoken queries in speech databases when the intended match may be non-exact or slightly complex. The system targets a low-resource setting, in which the language of the queries and of the searched audio is unspecified. Our method is based on a modified Dynamic Time Warping (DTW) algorithm that operates on posteriorgrams and extracts intricate paths to account for special cases of query match, such as word re-ordering, lexical variations and filler content. The system was evaluated on the MediaEval 2014 Query by Example Search on Speech (QUESST) task, where the spoken data comes from different languages that are unknown to the participant. We combined the results of five DTW modifications computed on the output of three phoneme recognizers of different languages. The combination of all systems provided the best overall performance and improved the detection of complex-case queries.
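To make the DTW-on-posteriorgrams idea concrete, the following is a minimal sketch of plain subsequence DTW, not the modified or segmented variants the paper evaluates. It assumes that each frame is a list of phoneme posterior probabilities, that the local distance is the negative log of the inner product of two frames, and that the path cost is normalized by path length. The names `local_distance` and `subsequence_dtw` and the `eps` floor are illustrative choices, not taken from the paper.

```python
import math

def local_distance(p, q, eps=1e-10):
    """Negative log of the inner product of two posterior vectors.
    eps is an assumed floor that avoids log(0)."""
    dot = sum(a * b for a, b in zip(p, q))
    return -math.log(max(dot, eps))

def subsequence_dtw(query, audio):
    """Find the best match of `query` anywhere inside `audio`.
    Returns (length-normalized cost, start frame, end frame)."""
    n, m = len(query), len(audio)
    inf = float("inf")
    cost = [[inf] * m for _ in range(n)]   # accumulated path cost
    steps = [[0] * m for _ in range(n)]    # number of cells on the path
    start = [[0] * m for _ in range(n)]    # audio frame where the path began

    # A path may start at any audio frame without penalty.
    for j in range(m):
        cost[0][j] = local_distance(query[0], audio[j])
        steps[0][j] = 1
        start[0][j] = j

    for i in range(1, n):
        for j in range(m):
            # Allowed predecessors: vertical, horizontal, diagonal.
            candidates = [(cost[i - 1][j], i - 1, j)]
            if j > 0:
                candidates.append((cost[i][j - 1], i, j - 1))
                candidates.append((cost[i - 1][j - 1], i - 1, j - 1))
            best, bi, bj = min(candidates)
            cost[i][j] = best + local_distance(query[i], audio[j])
            steps[i][j] = steps[bi][bj] + 1
            start[i][j] = start[bi][bj]

    # A path may also end at any audio frame; pick the lowest normalized cost.
    end = min(range(m), key=lambda j: cost[n - 1][j] / steps[n - 1][j])
    return cost[n - 1][end] / steps[n - 1][end], start[n - 1][end], end

# Toy posteriorgrams over two phoneme classes.
query = [[1.0, 0.0], [0.0, 1.0]]
audio = [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
score, first, last = subsequence_dtw(query, audio)
print(score, first, last)
```

On this toy input the best path covers audio frames 1 to 2, where the query matches exactly. Handling word re-ordering, lexical variation or filler content, as described in the abstract, would require relaxing the constraint that a single monotonic path covers the whole query, which this sketch does not attempt.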


