Spoken term detection based on the most probable phoneme sequence

2011 IEEE 9th International Symposium on Applied Machine Intelligence and Informatics (SAMI); Pub Date: 2011-03-24; DOI: 10.1109/SAMI.2011.5738856

G. Gosztolya, L. Tóth

Citations: 9

Abstract

The aim of the spoken term detection task is to find occurrences of user-entered keywords in an archive of audio recordings. In this area, besides the accuracy of the hits returned, search speed is also very important, which is why an intermediate representation of the recordings is normally used. In this paper we evaluate a spoken term detection method that represents each speech signal by its most probable phoneme sequence, on which a dynamic search is then performed. As the accuracy of the phoneme recognizer used is vital, we test this method with several approaches to phoneme identification. We found that our method already achieves satisfactory accuracy, although its run time is still rather high. We also found that this approach depends heavily on the performance of the phoneme recognizer.
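
The abstract does not specify how the dynamic search over the phoneme sequence works, so the following is only a minimal sketch of one plausible reading: approximate substring matching with a unit-cost edit distance, computed by dynamic programming. The function name `find_term`, the `max_cost` threshold, the unit costs, and the phoneme symbols are all assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def find_term(
    recognized: Sequence[str],
    keyword: Sequence[str],
    max_cost: int = 1,
) -> List[Tuple[int, int]]:
    """Return (end_index, cost) pairs where `keyword` approximately ends.

    Hypothetical helper: unit-cost edit distance with a free start
    position, i.e. approximate substring matching. The costs and the
    threshold are illustrative assumptions, not the paper's settings.
    """
    m = len(keyword)
    # prev[i]: cheapest cost of aligning keyword[:i] with some stretch of
    # `recognized` that ends just before the current position.
    prev = list(range(m + 1))
    hits: List[Tuple[int, int]] = []
    for j, phone in enumerate(recognized):
        curr = [0] * (m + 1)  # curr[0] == 0: a match may start anywhere
        for i in range(1, m + 1):
            substitution = prev[i - 1] + (keyword[i - 1] != phone)
            deletion = curr[i - 1] + 1  # keyword phoneme missing in the recording
            insertion = prev[i] + 1     # extra phoneme in the recording
            curr[i] = min(substitution, deletion, insertion)
        if curr[m] <= max_cost:
            hits.append((j, curr[m]))
        prev = curr
    return hits


# Illustrative phoneme strings (made up for this example).
recognized = "dh ax s p ow k ax n t er m".split()
keyword = "s p ow k ax n".split()
print(find_term(recognized, keyword, max_cost=0))  # [(7, 0)]
```

Because the recognizer output is searched directly, every substitution, insertion, or deletion it makes adds to the alignment cost. This is consistent with the abstract's observation that the approach depends heavily on the phoneme recognizer, and raising `max_cost` trades missed detections for more false alarms.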
