ACM HotMobile 2013 demo: NLify: mobile spoken natural language interfaces for everyone

Seungyeop Han, Matthai Philipose, Y. Ju
{"title":"ACM HotMobile 2013 demo: NLify: mobile spoken natural language interfaces for everyone","authors":"Seungyeop Han, Matthai Philipose, Y. Ju","doi":"10.1145/2542095.2542097","DOIUrl":null,"url":null,"abstract":"Speech has become an attractive means for interacting with the phone. When speech-enabled interactions are few, keyword-based interfaces [1] that require users to remember precise invocations are adequate. As the number of such interactions increases, users are more likely to forget keywords, and spoken natural language (SNL) interfaces that allow users to express their functional intent without conforming to a rigid syntax become desirable. Prominent “first-party” systems such as Siri and Google Voice Search offer such functionality on select domains today. In this demo, we present a system, NLify, which enables any (“third-party”) developer to add an SNL interface to their application. The key challenge behind the system is that there exists much variability even for a simple command. Worse, noise in speech recognition introduces additional variability. To address this challenge, we use webscale crowdsourcing and automated statistical machine paraphrasing to aid developers to cover much of the possible input space. In addition, we use a statistical language model [2] instead of deterministic one to further handle variability as it provides more tolerance against missing or reordered words. Figure 2 illustrates the overall architecture of NLify. NLify is fully integrated into the Windows Phone 8 development process in the form of a Visual Studio extension whose snapshot is presented in Figure 1. And a quantitative evaluation shows that NLify achieves overall recognition rates of 85% across intents.","PeriodicalId":43578,"journal":{"name":"Mobile Computing and Communications Review","volume":"1 1","pages":"2"},"PeriodicalIF":0.0000,"publicationDate":"2013-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Mobile Computing and Communications Review","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2542095.2542097","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

Speech has become an attractive means for interacting with the phone. When speech-enabled interactions are few, keyword-based interfaces [1] that require users to remember precise invocations are adequate. As the number of such interactions increases, users are more likely to forget keywords, and spoken natural language (SNL) interfaces that allow users to express their functional intent without conforming to a rigid syntax become desirable. Prominent "first-party" systems such as Siri and Google Voice Search offer such functionality on select domains today. In this demo, we present NLify, a system that enables any ("third-party") developer to add an SNL interface to their application. The key challenge is that there is much variability in how users phrase even a simple command; worse, noise in speech recognition introduces additional variability. To address this challenge, we use web-scale crowdsourcing and automated statistical machine paraphrasing to help developers cover much of the possible input space. In addition, we use a statistical language model [2] instead of a deterministic one to further handle variability, since it is more tolerant of missing or reordered words. Figure 2 illustrates the overall architecture of NLify. NLify is fully integrated into the Windows Phone 8 development process as a Visual Studio extension, a snapshot of which is shown in Figure 1. A quantitative evaluation shows that NLify achieves an overall recognition rate of 85% across intents.
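To illustrate why a statistical language model tolerates missing or reordered words better than the rigid keyword grammars the abstract contrasts against, the following is a minimal Python sketch. It is not NLify's implementation: the intents, example phrases, and smoothed unigram scoring below are hypothetical stand-ins for the paraphrase-expanded training data and statistical matching the paper describes.

```python
# Minimal sketch: per-intent unigram language models with add-one smoothing.
# Illustrative only; intents and example phrases are made up for this example.
from collections import Counter
import math

TRAINING = {
    "set_alarm": [
        "set an alarm for seven am",
        "wake me up at seven",
        "please set alarm at seven in the morning",
    ],
    "check_weather": [
        "what is the weather today",
        "will it rain today",
        "show me the forecast for today",
    ],
}

def build_model(phrases):
    """Count unigrams over all example phrases for one intent."""
    counts = Counter(w for p in phrases for w in p.lower().split())
    total = sum(counts.values())
    vocab = len(counts) + 1  # +1 reserves mass for unseen words
    return counts, total, vocab

MODELS = {intent: build_model(ps) for intent, ps in TRAINING.items()}

def score(utterance, model):
    """Log-probability of the utterance under a smoothed unigram model."""
    counts, total, vocab = model
    return sum(
        math.log((counts[w] + 1) / (total + vocab))
        for w in utterance.lower().split()
    )

def classify(utterance):
    """Pick the intent whose model assigns the highest score."""
    return max(MODELS, key=lambda intent: score(utterance, MODELS[intent]))

# Reordered or partially dropped words still map to the right intent,
# whereas an exact keyword grammar would reject these utterances outright.
print(classify("alarm seven set"))       # -> set_alarm
print(classify("rain today will it"))    # -> check_weather
```

Because each word contributes independently to the score, a recognizer that drops or shuffles words degrades the score gracefully instead of causing an outright parse failure, which is the tolerance property the abstract attributes to the statistical model.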