Mobile texting: can post-ASR correction solve the issues? An experimental study on gain vs. costs

IUI: International Conference on Intelligent User Interfaces. Pub Date: 2012-02-14. DOI: 10.1145/2166966.2166974

M. Feld, S. Momtazi, F. Freigang, D. Klakow, Christian A. Müller

Citations: 12

Abstract

The next big step in embedded, mobile speech recognition will be to allow completely free input, as needed for messaging such as SMS or email. However, unconstrained dictation remains error-prone, especially when the environment is noisy. In this paper, we compare different methods for improving a given free-text dictation system used to enter text-based messages in embedded mobile scenarios, where distraction, interaction cost, and hardware limitations impose stricter constraints than in traditional scenarios. We present a corpus-based evaluation that measures the trade-off between the improvement in word error rate and the interaction steps required under various parameters. Results show that by post-processing the output of a "black box" speech recognizer (e.g. a web-based speech recognition service), a reduction of word error rate by 55% (10.3% abs.) can be obtained. For further error reduction, however, a richer representation of the original hypotheses (e.g. lattice) is necessary.
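To make the reported metric concrete, the following is a minimal sketch of how word error rate (WER) and its relative reduction are commonly computed. It is not the authors' evaluation code: the function names `word_error_rate` and `relative_reduction` are hypothetical, and the sketch assumes the standard definition of WER as word-level edit distance divided by reference length. Read this way, a 55% relative reduction that equals 10.3 points absolute would imply a baseline WER of roughly 10.3 / 0.55 ≈ 18.7%, although the abstract does not state the baseline itself.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words.

    Assumes whitespace tokenization; real evaluations usually normalize
    case and punctuation first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Row-by-row Levenshtein distance over words.
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, corrected_wer: float) -> float:
    """Fraction of the baseline error removed by a correction step."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - corrected_wer) / baseline_wer


if __name__ == "__main__":
    # Hypothetical recognizer output with one substituted word.
    wer = word_error_rate("send a message to anna", "send the message to anna")
    print(f"WER: {wer:.2f}")
    print(f"Relative reduction: {relative_reduction(0.20, 0.10):.0%}")
```

The distinction between the two figures matters when comparing systems: an absolute reduction is a difference in percentage points, while a relative reduction depends on how high the baseline error was.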



