S. Srinivasan, D. Ponceleón, D. Petkovic, M. Viswanathan
{"title":"Query expansion for imperfect speech: applications in distributed learning","authors":"S. Srinivasan, D. Ponceleón, D. Petkovic, M. Viswanathan","doi":"10.1109/IVL.2000.853839","DOIUrl":null,"url":null,"abstract":"Advances in speech recognition technology have shown encouraging results for spoken document retrieval where the average precision often approaches 70% of that achieved for perfect text transcriptions. Typical applications of spoken document retrieval pertain to retrieval of stories from archived video/audio assets. In the CueVideo project, our application focus is spoken document retrieval from a video database for just-in-time training/distributed learning. Typical content is not pre-segmented, has no predefined structure, is of varying audio quality, and may not have domain specific data available. For such content, we propose a two level search, namely, a first level search across the entire video collection, and a second level search within a specific video. At both search levels, we perform an experimental evaluation of a combination of new and existing query expansion methods, intended to offset retrieval errors due to misrecognition.","PeriodicalId":333664,"journal":{"name":"2000 Proceedings Workshop on Content-based Access of Image and Video Libraries","volume":"14 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2000-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"12","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2000 Proceedings Workshop on Content-based Access of Image and Video Libraries","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IVL.2000.853839","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 12
Abstract
Advances in speech recognition technology have shown encouraging results for spoken document retrieval, where the average precision often approaches 70% of that achieved for perfect text transcriptions. Typical applications of spoken document retrieval pertain to retrieval of stories from archived video/audio assets. In the CueVideo project, our application focus is spoken document retrieval from a video database for just-in-time training/distributed learning. Typical content is not pre-segmented, has no predefined structure, is of varying audio quality, and may not have domain-specific data available. For such content, we propose a two-level search, namely, a first-level search across the entire video collection, and a second-level search within a specific video. At both search levels, we perform an experimental evaluation of a combination of new and existing query expansion methods, intended to offset retrieval errors due to misrecognition.
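To make the two-level idea concrete, the sketch below shows one possible shape of such a pipeline: expand the query with related terms (to compensate for words the recognizer may have missed), rank whole-video transcripts first, then rank segments inside the top video. This is a minimal illustrative example assuming a simple tf-idf scorer and a hypothetical `related` term table; it is not the paper's actual query expansion method or ranking model.

```python
# Illustrative sketch only: toy two-level search over noisy transcripts with
# naive query expansion. All names and the scoring scheme are assumptions,
# not the CueVideo implementation.
from collections import Counter
from math import log
from typing import Dict, List


def expand_query(terms: List[str], related: Dict[str, List[str]]) -> List[str]:
    """Augment query terms with related terms (e.g., thesaurus or
    co-occurrence based) to offset misrecognized words in transcripts."""
    expanded = list(terms)
    for t in terms:
        expanded.extend(related.get(t, []))
    return expanded


def score(doc_terms: Counter, query: List[str],
          df: Dict[str, int], n_docs: int) -> float:
    """Simple tf-idf style match of one transcript against an expanded query."""
    s = 0.0
    for q in query:
        tf = doc_terms.get(q, 0)
        if tf:
            s += tf * log(n_docs / (1 + df.get(q, 0)))
    return s


def two_level_search(videos: Dict[str, List[List[str]]],
                     query: List[str],
                     related: Dict[str, List[str]],
                     top_k: int = 3):
    """Level 1: rank whole videos; level 2: rank segments within the best video."""
    expanded = expand_query(query, related)

    # Level 1: collection-wide statistics over whole-video transcripts.
    video_terms = {v: Counter(t for seg in segs for t in seg)
                   for v, segs in videos.items()}
    df = Counter()
    for terms in video_terms.values():
        df.update(set(terms))
    n = len(videos)
    ranked = sorted(videos,
                    key=lambda v: score(video_terms[v], expanded, df, n),
                    reverse=True)
    best = ranked[0]

    # Level 2: search within the top-ranked video's segments.
    seg_counters = [Counter(seg) for seg in videos[best]]
    seg_df = Counter()
    for c in seg_counters:
        seg_df.update(set(c))
    ranked_segs = sorted(range(len(seg_counters)),
                         key=lambda i: score(seg_counters[i], expanded,
                                             seg_df, len(seg_counters)),
                         reverse=True)
    return best, ranked_segs[:top_k]
```

In this toy setup, each video is a list of segment transcripts (lists of words); the expansion table could come from a thesaurus or from co-occurrence statistics, which is where the paper's combination of new and existing expansion methods would plug in.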