语音和免提交互:神话、挑战和机遇

Proceedings of the 19th International Conference on Human-Computer Interaction with Mobile Devices and Services Pub Date : 2017-09-04 DOI:10.1145/3098279.3119919

Cosmin Munteanu, Gerald Penn

{"title":"语音和免提交互:神话、挑战和机遇","authors":"Cosmin Munteanu, Gerald Penn","doi":"10.1145/3098279.3119919","DOIUrl":null,"url":null,"abstract":"HCI research has for long been dedicated to better and more naturally facilitating information transfer between humans and machines. Unfortunately, humans' most natural form of communication, speech, is also one of the most difficult modalities to be understood by machines - despite, and perhaps, because it is the highest-bandwidth communication channel we possess. While significant research efforts, from engineering, to linguistic, and to cognitive sciences, have been spent on improving machines' ability to understand speech, the MobileHCI community (and the HCI field at large) has been relatively timid in embracing this modality as a central focus of research. This can be attributed in part to the unexpected variations in error rates when processing speech, in contrast with often-unfounded claims of success from industry, but also to the intrinsic difficulty of designing and especially evaluating speech and natural language interfaces. As such, the development of interactive speech-based systems is mostly driven by engineering efforts to improve such systems with respect to largely arbitrary performance metrics. Such developments have often been void of any user-centered design principles or consideration for usability or usefulness. The goal of this course is to inform the MobileHCI community of the current state of speech and natural language research, to dispel some of the myths surrounding speech-based interaction, as well as to provide an opportunity for researchers and practitioners to learn more about how speech recognition and speech synthesis work, what are their limitations, and how they could be used to enhance current interaction paradigms. Through this, we hope that HCI researchers and practitioners will learn how to combine recent advances in speech processing with user-centred principles in designing more usable and useful speech-based interactive systems.","PeriodicalId":120153,"journal":{"name":"Proceedings of the 19th International Conference on Human-Computer Interaction with Mobile Devices and Services","volume":"44 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Speech and Hands-free interaction: myths, challenges, and opportunities\",\"authors\":\"Cosmin Munteanu, Gerald Penn\",\"doi\":\"10.1145/3098279.3119919\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"HCI research has for long been dedicated to better and more naturally facilitating information transfer between humans and machines. Unfortunately, humans' most natural form of communication, speech, is also one of the most difficult modalities to be understood by machines - despite, and perhaps, because it is the highest-bandwidth communication channel we possess. While significant research efforts, from engineering, to linguistic, and to cognitive sciences, have been spent on improving machines' ability to understand speech, the MobileHCI community (and the HCI field at large) has been relatively timid in embracing this modality as a central focus of research. This can be attributed in part to the unexpected variations in error rates when processing speech, in contrast with often-unfounded claims of success from industry, but also to the intrinsic difficulty of designing and especially evaluating speech and natural language interfaces. As such, the development of interactive speech-based systems is mostly driven by engineering efforts to improve such systems with respect to largely arbitrary performance metrics. Such developments have often been void of any user-centered design principles or consideration for usability or usefulness. The goal of this course is to inform the MobileHCI community of the current state of speech and natural language research, to dispel some of the myths surrounding speech-based interaction, as well as to provide an opportunity for researchers and practitioners to learn more about how speech recognition and speech synthesis work, what are their limitations, and how they could be used to enhance current interaction paradigms. Through this, we hope that HCI researchers and practitioners will learn how to combine recent advances in speech processing with user-centred principles in designing more usable and useful speech-based interactive systems.\",\"PeriodicalId\":120153,\"journal\":{\"name\":\"Proceedings of the 19th International Conference on Human-Computer Interaction with Mobile Devices and Services\",\"volume\":\"44 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-09-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 19th International Conference on Human-Computer Interaction with Mobile Devices and Services\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3098279.3119919\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 19th International Conference on Human-Computer Interaction with Mobile Devices and Services","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3098279.3119919","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 2

摘要

长期以来，人机交互研究一直致力于更好、更自然地促进人与机器之间的信息传递。不幸的是，人类最自然的交流形式——语言，也是机器最难理解的形式之一——尽管，也许，因为它是我们拥有的最高带宽的交流渠道。虽然从工程学、语言学到认知科学的重大研究努力都花在了提高机器理解语音的能力上，但MobileHCI社区(以及整个HCI领域)在将这种模式作为研究的中心焦点方面一直相对胆怯。这可以部分归因于处理语音时错误率的意外变化，这与工业界经常毫无根据的成功声明形成对比，但也归因于设计，特别是评估语音和自然语言界面的内在困难。因此，基于语音的交互式系统的开发主要是由工程努力驱动的，目的是根据很大程度上任意的性能指标来改进此类系统。这样的开发通常缺乏任何以用户为中心的设计原则或对可用性或有用性的考虑。本课程的目标是向MobileHCI社区介绍语音和自然语言研究的现状，消除围绕基于语音的交互的一些神话，并为研究人员和从业者提供机会，了解更多关于语音识别和语音合成如何工作，它们的局限性是什么，以及如何使用它们来增强当前的交互范式。通过这项研究，我们希望HCI研究人员和实践者能够学习如何将语音处理的最新进展与以用户为中心的原则结合起来，设计出更可用和有用的基于语音的交互系统。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Speech and Hands-free interaction: myths, challenges, and opportunities

HCI research has for long been dedicated to better and more naturally facilitating information transfer between humans and machines. Unfortunately, humans' most natural form of communication, speech, is also one of the most difficult modalities to be understood by machines - despite, and perhaps, because it is the highest-bandwidth communication channel we possess. While significant research efforts, from engineering, to linguistic, and to cognitive sciences, have been spent on improving machines' ability to understand speech, the MobileHCI community (and the HCI field at large) has been relatively timid in embracing this modality as a central focus of research. This can be attributed in part to the unexpected variations in error rates when processing speech, in contrast with often-unfounded claims of success from industry, but also to the intrinsic difficulty of designing and especially evaluating speech and natural language interfaces. As such, the development of interactive speech-based systems is mostly driven by engineering efforts to improve such systems with respect to largely arbitrary performance metrics. Such developments have often been void of any user-centered design principles or consideration for usability or usefulness. The goal of this course is to inform the MobileHCI community of the current state of speech and natural language research, to dispel some of the myths surrounding speech-based interaction, as well as to provide an opportunity for researchers and practitioners to learn more about how speech recognition and speech synthesis work, what are their limitations, and how they could be used to enhance current interaction paradigms. Through this, we hope that HCI researchers and practitioners will learn how to combine recent advances in speech processing with user-centred principles in designing more usable and useful speech-based interactive systems.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Proceedings of the 19th International Conference on Human-Computer Interaction with Mobile Devices and Services

自引率

0.00%

发文量