Query expansion for imperfect speech: applications in distributed learning

2000 Proceedings Workshop on Content-based Access of Image and Video Libraries
Pub Date: 2000-06-16
DOI: 10.1109/IVL.2000.853839

S. Srinivasan, D. Ponceleón, D. Petkovic, M. Viswanathan


Citations: 12


Abstract

Advances in speech recognition technology have produced encouraging results for spoken document retrieval, where average precision often approaches 70% of that achieved with perfect text transcriptions. Typical applications of spoken document retrieval involve retrieving stories from archived video and audio assets. In the CueVideo project, our application focus is spoken document retrieval from a video database for just-in-time training and distributed learning. Typical content is not pre-segmented, has no predefined structure, varies in audio quality, and may have no domain-specific data available. For such content, we propose a two-level search: a first-level search across the entire video collection and a second-level search within a specific video. At both search levels, we experimentally evaluate a combination of new and existing query expansion methods intended to offset retrieval errors caused by misrecognition.
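The two-level search with query expansion can be illustrated with a minimal Python sketch. The abstract does not specify which expansion methods the paper evaluates, so everything here is an assumption made for illustration: the `EXPANSIONS` table (hand-written variants a recognizer might output), the term-count scoring, the fixed transcript windows standing in for unsegmented content, and all function names. It is not the CueVideo implementation.

```python
from collections import Counter

# Hypothetical expansion table (illustrative data, not from the paper):
# each query term maps to variants a speech recognizer might output instead.
EXPANSIONS = {
    "database": ["data base"],
    "query": ["queries"],
}

def tokenize(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()

def expand_query(query, expansions=EXPANSIONS):
    """Return the query terms plus the single words of any listed variants."""
    terms = []
    for term in tokenize(query):
        terms.append(term)
        for variant in expansions.get(term, []):
            terms.extend(tokenize(variant))
    return list(dict.fromkeys(terms))  # preserve order, drop duplicates

def score(terms, text):
    """Count how often any of the terms occurs in the text."""
    counts = Counter(tokenize(text))
    return sum(counts[t] for t in terms)

def search_collection(query, videos, expansions=EXPANSIONS):
    """First level: rank whole videos by their total transcript score."""
    terms = expand_query(query, expansions)
    ranked = []
    for video_id, windows in videos.items():
        total = sum(score(terms, text) for _, text in windows)
        if total > 0:
            ranked.append((video_id, total))
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return ranked

def search_within(query, windows, expansions=EXPANSIONS):
    """Second level: rank time windows inside a single video."""
    terms = expand_query(query, expansions)
    hits = [(start, score(terms, text)) for start, text in windows]
    hits = [hit for hit in hits if hit[1] > 0]
    hits.sort(key=lambda hit: (-hit[1], hit[0]))
    return hits

# Toy transcripts: video id -> list of (window start in seconds, recognized text).
VIDEOS = {
    "intro_sql": [
        (0, "welcome to this course"),
        (30, "a data base stores tables"),
        (60, "each query reads the data base"),
    ],
    "networking": [
        (0, "packets travel across the network"),
        (30, "routers forward packets"),
    ],
}

if __name__ == "__main__":
    print(search_collection("database", VIDEOS))
    print(search_within("database", VIDEOS["intro_sql"]))
```

In this toy data the recognizer has split "database" into "data base", so an unexpanded query finds nothing, while the expanded query retrieves the video at the first level and its matching time windows at the second.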

