Mobile texting: can post-ASR correction solve the issues? an experimental study on gain vs. costs

M. Feld, S. Momtazi, F. Freigang, D. Klakow, Christian A. Müller
{"title":"Mobile texting: can post-ASR correction solve the issues? an experimental study on gain vs. costs","authors":"M. Feld, S. Momtazi, F. Freigang, D. Klakow, Christian A. Müller","doi":"10.1145/2166966.2166974","DOIUrl":null,"url":null,"abstract":"The next big step in embedded, mobile speech recognition will be to allow completely free input as it is needed for messaging like SMS or email. However, unconstrained dictation remains error-prone, especially when the environment is noisy. In this paper, we compare different methods for improving a given free-text dictation system used to enter textbased messages in embedded mobile scenarios, where distraction, interaction cost, and hardware limitations enforce strict constraints over traditional scenarios. We present a corpus-based evaluation, measuring the trade-off between improvement of the word error rate versus the interaction steps that are required under various parameters. Results show that by post-processing the output of a \"black box\" speech recognizer (e.g. a web-based speech recognition service), a reduction of word error rate by 55% (10.3% abs.) can be obtained. For further error reduction, however, a richer representation of the original hypotheses (e.g. lattice) is necessary.","PeriodicalId":87287,"journal":{"name":"IUI. International Conference on Intelligent User Interfaces","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2012-02-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"12","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IUI. International Conference on Intelligent User Interfaces","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2166966.2166974","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 12

Abstract

The next big step in embedded, mobile speech recognition will be to allow completely free input as it is needed for messaging like SMS or email. However, unconstrained dictation remains error-prone, especially when the environment is noisy. In this paper, we compare different methods for improving a given free-text dictation system used to enter textbased messages in embedded mobile scenarios, where distraction, interaction cost, and hardware limitations enforce strict constraints over traditional scenarios. We present a corpus-based evaluation, measuring the trade-off between improvement of the word error rate versus the interaction steps that are required under various parameters. Results show that by post-processing the output of a "black box" speech recognizer (e.g. a web-based speech recognition service), a reduction of word error rate by 55% (10.3% abs.) can be obtained. For further error reduction, however, a richer representation of the original hypotheses (e.g. lattice) is necessary.
手机短信:asr后的修正能解决问题吗?收益与成本的实验研究
嵌入式移动语音识别的下一个重要步骤将是允许完全自由的输入,因为它需要像短信或电子邮件这样的消息传递。然而,不受约束的听写仍然容易出错,尤其是在环境嘈杂的情况下。在本文中,我们比较了不同的方法来改进给定的自由文本听写系统,该系统用于在嵌入式移动场景中输入基于文本的消息,其中干扰、交互成本和硬件限制比传统场景强制执行严格的约束。我们提出了一个基于语料库的评估,衡量在不同参数下,单词错误率的改善与所需的交互步骤之间的权衡。结果表明,通过对“黑盒”语音识别器(例如基于web的语音识别服务)的输出进行后处理,可以将单词错误率降低55% (10.3% abs.)。然而,为了进一步减小误差,原始假设的更丰富的表示(例如格)是必要的。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信