Improved semantic retrieval of spoken content by language models enhanced with acoustic similarity graph

2012 IEEE Spoken Language Technology Workshop (SLT). Pub Date: 2012-12-01. DOI: 10.1109/SLT.2012.6424219

Hung-yi Lee, Tsung-Hsien Wen, Lin-Shan Lee

Citations: 11

Abstract

Retrieving objects semantically related to the query has been widely studied in text information retrieval. However, when text-based techniques are applied to spoken content, the inevitable recognition errors may seriously degrade performance. In this paper, we propose to enhance the expected term frequencies estimated from spoken content using acoustic similarity graphs. For each word in the lexicon, a graph is constructed that describes the acoustic similarity among spoken segments in the archive. Score propagation over the graph helps in estimating the expected term frequencies. The enhanced expected term frequencies can be used in the language modeling retrieval approach, as well as in semantic retrieval techniques such as document expansion based on latent semantic analysis and query expansion considering both words and latent topic information. Preliminary experiments on Mandarin broadcast news indicated that improved performance was achievable under different conditions.
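
The abstract does not state the propagation rule, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that each spoken segment hypothesized to contain a given word has an initial recognition-based score, and that propagation interpolates this score with the weighted average score of acoustically similar segments. The function names, the interpolation weight `alpha`, the fixed iteration count, and the toy data layout are all hypothetical.

```python
def propagate_scores(initial, similarity, alpha=0.5, iterations=100):
    """Smooth per-segment scores over an acoustic similarity graph.

    initial    -- initial score of each segment for one word (assumed in [0, 1])
    similarity -- square matrix of non-negative acoustic similarities
    alpha      -- assumed weight given to neighbours (0 keeps the initial scores)
    """
    n = len(initial)
    weights = []
    for i in range(n):
        # Ignore self-similarity, then normalise the row so weights sum to 1.
        row = [similarity[i][j] if j != i else 0.0 for j in range(n)]
        total = sum(row)
        if total > 0:
            weights.append([w / total for w in row])
        else:
            # Isolated segment: a self-loop keeps its score unchanged.
            weights.append([1.0 if j == i else 0.0 for j in range(n)])

    scores = list(initial)
    for _ in range(iterations):
        # New score = own initial evidence + weighted average of neighbours.
        scores = [
            (1 - alpha) * initial[i]
            + alpha * sum(weights[i][j] * scores[j] for j in range(n))
            for i in range(n)
        ]
    return scores


def expected_term_frequency(scores, segment_to_doc):
    """Sum the segment scores of one word within each spoken document."""
    totals = {}
    for score, doc in zip(scores, segment_to_doc):
        totals[doc] = totals.get(doc, 0.0) + score
    return totals


if __name__ == "__main__":
    # Toy data: segments 0 and 1 sound alike, segment 2 has no neighbours.
    initial = [0.9, 0.2, 0.2]
    similarity = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    scores = propagate_scores(initial, similarity)
    print(scores)
    print(expected_term_frequency(scores, ["doc_a", "doc_b", "doc_b"]))
```

In this toy setting, the low-scoring segment 1 is pulled upward by its acoustically similar, high-scoring neighbour, while the isolated segment 2 keeps its initial score. The per-document sums could then stand in for term counts in a language modeling retrieval model.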


