Should robots display what they hear? Mishearing as a practical accomplishment.

Impact factor: 3.0, JCR quartile: Q2 (Robotics)

Frontiers in Robotics and AI. Publication date: 2025-09-26. eCollection date: 2025-01-01. DOI: 10.3389/frobt.2025.1597276

Damien Rudaz, Christian Licoppe

Frontiers in Robotics and AI, volume 12, article 1597276. Publication type: Journal Article. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12511783/pdf/

Citations: 0

Abstract

Should robots display what they hear? Mishearing as a practical accomplishment.

As a contribution to research on transparency and failures in human-robot interaction (HRI), our study investigates whether the informational ecology configured by publicly displaying a robot's automatic speech recognition (ASR) results is consequential in how miscommunications emerge and are dealt with. After a preliminary quantitative analysis of our participants' gaze behavior during an experiment where they interacted with a conversational robot, we rely on a micro-analytic approach to detail how the interpretation of this robot's conduct as inadequate was configured by what it displayed as having "heard" on its tablet. We investigate cases where an utterance or gesture by the robot was treated by participants as sequentially relevant only as long as they had not read the automatic speech recognition transcript but re-evaluated it as troublesome once they had read it. In doing so, we contribute to HRI by showing that systematically displaying an ASR transcript can play a crucial role in participants' interpretation of a co-constructed action (such as shaking hands with a robot) as having "failed". We demonstrate that "mistakes" and "errors" can be approached as practical accomplishments that emerge as such over the course of interaction rather than as social or technical phenomena pre-categorized by the researcher in reference to criteria exogenous to the activity being analyzed. In the end, while narrowing down on two video fragments, we find that this peculiar informational ecology did not merely impact how the robot was responded to. Instead, it modified the very definition of "mutual understanding" that was enacted and oriented to as relevant by the human participants in these fragments. Besides social robots, we caution that systematically providing such transcripts is a design decision not to be taken lightly; depending on the setting, it may have unintended consequences on interactions between humans and any form of conversational interface.

Source journal: Frontiers in Robotics and AI (Robotics)

CiteScore: 6.50
Self-citation rate: 5.90%
Articles published: 355
Review time: 14 weeks

About the journal: Frontiers in Robotics and AI publishes rigorously peer-reviewed research covering all theory and applications of robotics, technology, and artificial intelligence, from biomedical to space robotics.