Incorporating syllable duration into line-detection-based spoken term detection

2012 IEEE Spoken Language Technology Workshop (SLT). Pub Date: 2012-12-01. DOI: 10.1109/SLT.2012.6424223

Teppei Ohno, T. Akiba

Citations: 2

Abstract

A conventional method for spoken term detection (STD) applies approximate string matching to the subword sequences obtained by running speech recognition on a spoken document. An STD method that treats string matching as line detection in a syllable distance plane has previously been proposed. Although that method produces detections quickly and in order of distance, it still suffers from the insertion and deletion errors introduced by speech recognition. In this work, we aim to improve detection performance by employing syllable-duration information. The proposed method achieves more robust detection by introducing a distance plane whose units are frames rather than syllables. Our experimental evaluation showed that incorporating syllable duration yields higher detection performance in high-recall regions.
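The idea of matching as line detection in a distance plane, and of switching the plane's units from syllables to frames, can be illustrated with a small sketch. It is not the authors' implementation: the binary syllable distance, the fixed unit-slope diagonal search, the toy syllables and durations, and the function names expand_to_frames, distance_plane and best_diagonal are all assumptions made for illustration only.

```python
# Illustrative sketch only. Assumptions not taken from the paper:
# a binary (0/1) syllable distance, a fixed unit-slope diagonal search,
# and hand-picked toy durations measured in frames.

def expand_to_frames(syllables):
    """Repeat each syllable label once per frame of its duration."""
    frames = []
    for label, n_frames in syllables:
        frames.extend([label] * n_frames)
    return frames

def distance_plane(query, document):
    """plane[i][j] is the distance between query unit i and document unit j."""
    return [[0.0 if q == d else 1.0 for d in document] for q in query]

def best_diagonal(plane):
    """Return (mean distance, start column) of the cheapest unit-slope line."""
    n_q, n_d = len(plane), len(plane[0])
    best_cost, best_start = float("inf"), -1
    for start in range(n_d - n_q + 1):
        cost = sum(plane[i][start + i] for i in range(n_q)) / n_q
        if cost < best_cost:
            best_cost, best_start = cost, start
    return best_cost, best_start

# Toy query and recognized document as (syllable, duration in frames).
# The recognizer is assumed to have inserted a spurious one-frame "a".
query = [("ka", 3), ("ra", 3), ("su", 3)]
document = [("no", 3), ("ka", 3), ("ra", 2), ("a", 1), ("su", 3), ("ga", 3)]

# Syllable-unit plane: every syllable occupies exactly one cell.
syl_cost, syl_start = best_diagonal(distance_plane(
    [label for label, _ in query], [label for label, _ in document]))

# Frame-unit plane: every syllable occupies as many cells as its duration.
frame_cost, frame_start = best_diagonal(distance_plane(
    expand_to_frames(query), expand_to_frames(document)))

print(f"syllable units: cost={syl_cost:.3f} at syllable {syl_start}")
print(f"frame units:    cost={frame_cost:.3f} at frame {frame_start}")
```

In this toy case the short inserted syllable costs one full unit in the syllable plane but only a single frame in the frame plane, so the best line in the frame plane has a lower mean distance. This is one way to picture why duration information might make detection more tolerant of insertion and deletion errors.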
