Spoken term detection based on the most probable phoneme sequence

G. Gosztolya, L. Tóth
2011 IEEE 9th International Symposium on Applied Machine Intelligence and Informatics (SAMI). Published 2011-03-24. DOI: 10.1109/SAMI.2011.5738856 (https://doi.org/10.1109/SAMI.2011.5738856)
Citations: 9

Abstract

The aim of the spoken term detection task is to find occurrences of user-entered keywords in an archive of audio recordings. Besides the accuracy of the returned hits, search speed is also very important, which is why an intermediate representation of the recordings is normally used. In this paper we evaluate a spoken term detection method that represents each speech signal by its most probable phoneme sequence, on which a dynamic search is then performed. As the accuracy of the phoneme recognizer is vital, we test the method with several approaches to phoneme identification. We found that the method already achieves satisfactory accuracy, although its run time is still rather high. We also found that the approach depends heavily on the performance of the phoneme recognizer.
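
The abstract does not spell out how the dynamic search over the most probable phoneme sequence works, so the following is only a minimal sketch of one plausible reading: approximate substring matching by dynamic programming, with unit edit costs. The function name `find_term`, the `max_cost` threshold, the unit costs, and the toy phoneme symbols are all assumptions made for illustration, not details taken from the paper.

```python
def find_term(phonemes, term, max_cost=1):
    """Approximate search for `term` inside a recognized phoneme sequence.

    Returns a list of (end_index, cost) pairs, where `end_index` is the
    1-based position in `phonemes` at which a match ends and `cost` is the
    edit distance between `term` and the best-matching substring ending
    there. Unit substitution/insertion/deletion costs are an assumption.
    """
    m = len(term)
    # prev[j]: cheapest alignment of term[:j] with a substring that ends
    # at the previous position. Before any phoneme is read, matching
    # term[:j] costs j deletions.
    prev = list(range(m + 1))
    hits = []
    for i, p in enumerate(phonemes, start=1):
        cur = [0] * (m + 1)  # cur[0] = 0 lets a match start anywhere
        for j in range(1, m + 1):
            substitute = prev[j - 1] + (0 if term[j - 1] == p else 1)
            extra_in_audio = prev[j] + 1        # recognizer inserted a phoneme
            missing_in_audio = cur[j - 1] + 1   # recognizer dropped a phoneme
            cur[j] = min(substitute, extra_in_audio, missing_in_audio)
        if cur[m] <= max_cost:
            hits.append((i, cur[m]))
        prev = cur
    return hits


# Toy example with made-up phoneme symbols.
recording = "dh ax k ae t s ae t".split()
keyword = "k ae t".split()
print(find_term(recording, keyword, max_cost=0))  # [(5, 0)]
```

Raising `max_cost` tolerates recognizer errors at the price of more false alarms, which is one way to see why such an approach would depend heavily on the quality of the phoneme recognizer. The sketch costs time proportional to the recording length times the keyword length per query, so it says nothing about how the paper's own run time was obtained.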