通过可解释的强化学习改善人机交互

Aaquib Tabrez, Bradley Hayes
{"title":"通过可解释的强化学习改善人机交互","authors":"Aaquib Tabrez, Bradley Hayes","doi":"10.1109/HRI.2019.8673198","DOIUrl":null,"url":null,"abstract":"Gathering the most informative data from humans without overloading them remains an active research area in AI, and is closely coupled with the problems of determining how and when information should be communicated to others [12]. Current decision support systems (DSS) are still overly simple and static, and cannot adapt to changing environments we expect to deploy in modern systems [3], [4], [9], [11]. They are intrinsically limited in their ability to explain rationale versus merely listing their future behaviors, limiting a human's understanding of the system [2], [7]. Most probabilistic assessments of a task are conveyed after the task/skill is attempted rather than before [10], [14], [16]. This limits failure recovery and danger avoidance mechanisms. Existing work on predicting failures relies on sensors to accurately detect explicitly annotated and learned failure modes [13]. As such, important non-obvious pieces of information for assessing appropriate trust and/or course-of-action (COA) evaluation in collaborative scenarios can go overlooked, while irrelevant information may instead be provided that increases clutter and mental workload. Understanding how AI models arrive at specific decisions is a key principle of trust [8]. Therefore, it is critically important to develop new strategies for anticipating, communicating, and explaining justifications and rationale for AI driven behaviors via contextually appropriate semantics.","PeriodicalId":6600,"journal":{"name":"2019 14th ACM/IEEE International Conference on Human-Robot Interaction (HRI)","volume":"51 1","pages":"751-753"},"PeriodicalIF":0.0000,"publicationDate":"2019-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"34","resultStr":"{\"title\":\"Improving Human-Robot Interaction Through Explainable Reinforcement Learning\",\"authors\":\"Aaquib Tabrez, Bradley Hayes\",\"doi\":\"10.1109/HRI.2019.8673198\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Gathering the most informative data from humans without overloading them remains an active research area in AI, and is closely coupled with the problems of determining how and when information should be communicated to others [12]. Current decision support systems (DSS) are still overly simple and static, and cannot adapt to changing environments we expect to deploy in modern systems [3], [4], [9], [11]. They are intrinsically limited in their ability to explain rationale versus merely listing their future behaviors, limiting a human's understanding of the system [2], [7]. Most probabilistic assessments of a task are conveyed after the task/skill is attempted rather than before [10], [14], [16]. This limits failure recovery and danger avoidance mechanisms. Existing work on predicting failures relies on sensors to accurately detect explicitly annotated and learned failure modes [13]. As such, important non-obvious pieces of information for assessing appropriate trust and/or course-of-action (COA) evaluation in collaborative scenarios can go overlooked, while irrelevant information may instead be provided that increases clutter and mental workload. Understanding how AI models arrive at specific decisions is a key principle of trust [8]. Therefore, it is critically important to develop new strategies for anticipating, communicating, and explaining justifications and rationale for AI driven behaviors via contextually appropriate semantics.\",\"PeriodicalId\":6600,\"journal\":{\"name\":\"2019 14th ACM/IEEE International Conference on Human-Robot Interaction (HRI)\",\"volume\":\"51 1\",\"pages\":\"751-753\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-03-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"34\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 14th ACM/IEEE International Conference on Human-Robot Interaction (HRI)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/HRI.2019.8673198\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 14th ACM/IEEE International Conference on Human-Robot Interaction (HRI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HRI.2019.8673198","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 34

摘要

从人类那里收集最具信息量的数据而不使其超载仍然是人工智能的一个活跃研究领域,并且与确定如何以及何时将信息传达给他人的问题密切相关[12]。当前的决策支持系统(DSS)仍然过于简单和静态,无法适应我们期望在现代系统中部署的不断变化的环境[3],[4],[9],[11]。与仅仅列出未来的行为相比,它们解释基本原理的能力在本质上是有限的,这限制了人类对系统的理解[2],[7]。大多数对任务的概率评估是在任务/技能被尝试之后而不是之前进行的[10],[14],[16]。这限制了故障恢复和危险规避机制。现有的故障预测工作依赖于传感器来准确检测显式注释和学习的故障模式[13]。因此,用于评估协作场景中的适当信任和/或行动方案(COA)评估的重要的非明显信息片段可能会被忽略,而提供的不相关信息可能会增加混乱和心理工作量。理解人工智能模型如何做出具体决策是信任的一个关键原则[8]。因此,开发新的策略,通过上下文适当的语义来预测、交流和解释人工智能驱动行为的理由和基本原理,是至关重要的。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Improving Human-Robot Interaction Through Explainable Reinforcement Learning
Gathering the most informative data from humans without overloading them remains an active research area in AI, and is closely coupled with the problems of determining how and when information should be communicated to others [12]. Current decision support systems (DSS) are still overly simple and static, and cannot adapt to changing environments we expect to deploy in modern systems [3], [4], [9], [11]. They are intrinsically limited in their ability to explain rationale versus merely listing their future behaviors, limiting a human's understanding of the system [2], [7]. Most probabilistic assessments of a task are conveyed after the task/skill is attempted rather than before [10], [14], [16]. This limits failure recovery and danger avoidance mechanisms. Existing work on predicting failures relies on sensors to accurately detect explicitly annotated and learned failure modes [13]. As such, important non-obvious pieces of information for assessing appropriate trust and/or course-of-action (COA) evaluation in collaborative scenarios can go overlooked, while irrelevant information may instead be provided that increases clutter and mental workload. Understanding how AI models arrive at specific decisions is a key principle of trust [8]. Therefore, it is critically important to develop new strategies for anticipating, communicating, and explaining justifications and rationale for AI driven behaviors via contextually appropriate semantics.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信