Joint language models for automatic speech recognition and understanding

2012 IEEE Spoken Language Technology Workshop (SLT) Pub Date: 2012-12-01 DOI: 10.1109/SLT.2012.6424222

Ali Orkan Bayer, G. Riccardi

Citations: 10

Abstract

Language models (LMs) are one of the main knowledge sources used by automatic speech recognition (ASR) and spoken language understanding (SLU) systems. In ASR systems, they are optimized to decode words from speech for a transcription task. In SLU systems, they are optimized to map words into concept constructs or interpretation representations. Performance optimization is generally designed independently for ASR and SLU models, in terms of word accuracy and concept accuracy, respectively. However, the best word accuracy does not always yield the best understanding performance. In this paper, we investigate how LMs originally trained to maximize word accuracy can be parametrized to account for speech understanding constraints and maximize concept accuracy. An incremental reduction in concept error rate is observed when an LM is trained on word-to-concept mappings. We show how to optimize the joint transcription and understanding task performance in the lexical-semantic relation space.

