Query by example search with segmented dynamic time warping for non-exact spoken queries

Jorge Proença, A. Veiga, F. Perdigão
Published in: 2015 23rd European Signal Processing Conference (EUSIPCO)
Publication date: 2015-12-28
DOI: 10.1109/EUSIPCO.2015.7362666 (https://doi.org/10.1109/EUSIPCO.2015.7362666)
Citations: 19

Abstract

This paper presents an approach to the Query-by-Example task of finding spoken queries in speech databases when the intended match may be non-exact or slightly complex. The system is low-resource: it targets the case where the language of the queries and of the searched audio is unspecified. Our method is based on a modified Dynamic Time Warping (DTW) algorithm that operates on posteriorgrams and extracts intricate paths to account for special cases of query match such as word re-ordering, lexical variations and filler content. The system was evaluated on the MediaEval 2014 task of Query by Example Search on Speech (QUESST), where the spoken data comes from different languages unknown to the participant. We combined the results of five DTW modifications computed on the output of three phoneme recognizers of different languages. The combination of all systems provided the best overall performance and improved detection of complex-case queries.
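To make the core idea concrete, here is a minimal sketch of plain subsequence DTW over posteriorgrams. It is not the authors' system: the abstract does not specify the five DTW modifications, the local distance, or the normalization, so the negative log inner product distance, the path-length normalization, and the names `frame_distance` and `subsequence_dtw` are all assumptions made for illustration. The sketch only shows how a query can be aligned to any region of a longer utterance.

```python
import math

def frame_distance(p, q, eps=1e-10):
    """Assumed local distance: -log of the inner product of two posterior vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    return -math.log(max(dot, eps))

def subsequence_dtw(query, audio):
    """Find the best match of `query` anywhere inside `audio`.

    Both arguments are lists of posterior vectors (one per frame).
    Returns (normalized_cost, start_frame, end_frame).
    """
    n, m = len(query), len(audio)
    inf = float("inf")
    cost = [[inf] * m for _ in range(n)]    # accumulated distance per cell
    length = [[0] * m for _ in range(n)]    # path length, used for normalization
    start = [[0] * m for _ in range(n)]     # audio frame where the path began
    for i in range(n):
        for j in range(m):
            d = frame_distance(query[i], audio[j])
            if i == 0:
                # A path may begin at any audio frame at no extra cost.
                cost[i][j], length[i][j], start[i][j] = d, 1, j
                continue
            candidates = [(i - 1, j)]
            if j > 0:
                candidates += [(i, j - 1), (i - 1, j - 1)]
            # Extend the predecessor with the lowest accumulated cost.
            pi, pj = min(candidates, key=lambda c: cost[c[0]][c[1]])
            cost[i][j] = cost[pi][pj] + d
            length[i][j] = length[pi][pj] + 1
            start[i][j] = start[pi][pj]
    # A path may also end at any audio frame; pick the best normalized cost.
    end = min(range(m), key=lambda j: cost[n - 1][j] / length[n - 1][j])
    return cost[n - 1][end] / length[n - 1][end], start[n - 1][end], end

# Toy posteriors over three hypothetical phone classes.
A, B, C = [0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]
score, start_frame, end_frame = subsequence_dtw([A, B], [C, C, A, B, C])
print(start_frame, end_frame)  # 2 3
```

Handling word re-ordering, lexical variations or filler content, as the abstract describes, would require changes to the path constraints that go beyond this basic alignment.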