Improved semantic retrieval of spoken content by language models enhanced with acoustic similarity graph

Hung-yi Lee, Tsung-Hsien Wen, Lin-Shan Lee
2012 IEEE Spoken Language Technology Workshop (SLT), published 2012-12-01. DOI: 10.1109/SLT.2012.6424219 (https://doi.org/10.1109/SLT.2012.6424219)
Citations: 11

Abstract

Retrieving objects semantically related to the query has been widely studied in text information retrieval. However, when text-based techniques are applied to spoken content, the inevitable recognition errors may seriously degrade performance. In this paper, we propose to enhance the expected term frequencies estimated from spoken content by using acoustic similarity graphs. For each word in the lexicon, a graph is constructed describing the acoustic similarity among spoken segments in the archive. Score propagation over the graph helps in estimating the expected term frequencies. The enhanced expected term frequencies can be used in the language modeling retrieval approach, as well as in semantic retrieval techniques such as document expansion based on latent semantic analysis and query expansion considering both words and latent topic information. Preliminary experiments performed on Mandarin broadcast news indicated that improved performance was achievable under different conditions.
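
The abstract does not spell out how scores are propagated over the graph, so the following is only a generic sketch of the idea, not the authors' formulation. It assumes a random-walk-with-restart style update in which each spoken segment's score for one word is interpolated between its initial expected term frequency and the scores of acoustically similar segments. The function name `propagate_scores`, the interpolation weight `alpha`, the iteration count, and the toy similarity values are all hypothetical.

```python
def propagate_scores(initial, similarity, alpha=0.5, iterations=50):
    """Smooth per-segment scores for one word over a similarity graph.

    initial:    initial expected term frequency of the word in each segment
    similarity: square matrix of non-negative acoustic similarities
    alpha:      weight given to neighbours' scores (assumed value)
    """
    n = len(initial)

    # Row-normalise similarities so each segment splits its score
    # among its neighbours in proportion to acoustic similarity.
    transition = []
    for i, row in enumerate(similarity):
        total = sum(row)
        if total > 0:
            transition.append([w / total for w in row])
        else:
            # Isolated segment: use a self-loop so it keeps its own score.
            transition.append([1.0 if j == i else 0.0 for j in range(n)])

    scores = list(initial)
    for _ in range(iterations):
        # Interpolate between the initial score and the propagated score.
        scores = [
            (1 - alpha) * initial[i]
            + alpha * sum(transition[j][i] * scores[j] for j in range(n))
            for i in range(n)
        ]
    return scores


# Toy example: segments 0 and 1 sound alike; segment 2 has no neighbours.
initial_etf = [0.9, 0.2, 0.2]
acoustic_similarity = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
enhanced_etf = propagate_scores(initial_etf, acoustic_similarity)
print(enhanced_etf)
```

In this toy setting, segment 1 starts with the same low score as segment 2 but ends up higher because it is acoustically similar to the confidently recognised segment 0, which is the kind of correction for recognition errors the abstract describes.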