Establishing Human Observer Criterion in Evaluating Artificial Social Intelligence Agents in a Search and Rescue Task.

IF 2.9 2区 心理学 Q1 PSYCHOLOGY, EXPERIMENTAL
Topics in Cognitive Science Pub Date : 2025-04-01 Epub Date: 2023-04-13 DOI:10.1111/tops.12648
Lixiao Huang, Jared Freeman, Nancy J Cooke, Myke C Cohen, Xiaoyun Yin, Jeska Clark, Matt Wood, Verica Buchanan, Christopher Corral, Federico Scholcover, Anagha Mudigonda, Lovein Thomas, Aaron Teo, John Colonna-Romano
{"title":"Establishing Human Observer Criterion in Evaluating Artificial Social Intelligence Agents in a Search and Rescue Task.","authors":"Lixiao Huang, Jared Freeman, Nancy J Cooke, Myke C Cohen, Xiaoyun Yin, Jeska Clark, Matt Wood, Verica Buchanan, Christopher Corral, Federico Scholcover, Anagha Mudigonda, Lovein Thomas, Aaron Teo, John Colonna-Romano","doi":"10.1111/tops.12648","DOIUrl":null,"url":null,"abstract":"<p><p>Artificial social intelligence (ASI) agents have great potential to aid the success of individuals, human-human teams, and human-artificial intelligence teams. To develop helpful ASI agents, we created an urban search and rescue task environment in Minecraft to evaluate ASI agents' ability to infer participants' knowledge training conditions and predict participants' next victim type to be rescued. We evaluated ASI agents' capabilities in three ways: (a) comparison to ground truth-the actual knowledge training condition and participant actions; (b) comparison among different ASI agents; and (c) comparison to a human observer criterion, whose accuracy served as a reference point. The human observers and the ASI agents used video data and timestamped event messages from the testbed, respectively, to make inferences about the same participants and topic (knowledge training condition) and the same instances of participant actions (rescue of victims). Overall, ASI agents performed better than human observers in inferring knowledge training conditions and predicting actions. Refining the human criterion can guide the design and evaluation of ASI agents for complex task environments and team composition.</p>","PeriodicalId":47822,"journal":{"name":"Topics in Cognitive Science","volume":" ","pages":"349-373"},"PeriodicalIF":2.9000,"publicationDate":"2025-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Topics in Cognitive Science","FirstCategoryId":"102","ListUrlMain":"https://doi.org/10.1111/tops.12648","RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2023/4/13 0:00:00","PubModel":"Epub","JCR":"Q1","JCRName":"PSYCHOLOGY, EXPERIMENTAL","Score":null,"Total":0}
引用次数: 0

Abstract

Artificial social intelligence (ASI) agents have great potential to aid the success of individuals, human-human teams, and human-artificial intelligence teams. To develop helpful ASI agents, we created an urban search and rescue task environment in Minecraft to evaluate ASI agents' ability to infer participants' knowledge training conditions and predict participants' next victim type to be rescued. We evaluated ASI agents' capabilities in three ways: (a) comparison to ground truth-the actual knowledge training condition and participant actions; (b) comparison among different ASI agents; and (c) comparison to a human observer criterion, whose accuracy served as a reference point. The human observers and the ASI agents used video data and timestamped event messages from the testbed, respectively, to make inferences about the same participants and topic (knowledge training condition) and the same instances of participant actions (rescue of victims). Overall, ASI agents performed better than human observers in inferring knowledge training conditions and predicting actions. Refining the human criterion can guide the design and evaluation of ASI agents for complex task environments and team composition.

在搜救任务中评价人工社会智能主体的人类观察者准则的建立。
人工社会智能(ASI)代理在帮助个人、人与人之间的团队以及人与人工智能团队取得成功方面具有巨大的潜力。为了开发有用的ASI智能体,我们在《我的世界》中创建了一个城市搜索和救援任务环境,以评估ASI智能体推断参与者的知识培训条件和预测参与者下一个需要救援的受害者类型的能力。我们通过三种方式评估ASI智能体的能力:(a)与实际知识训练条件和参与者行为的比较;(b)不同ASI药剂之间的比较;(c)与人类观察者标准的比较,其精度作为参考点。人类观察者和ASI代理分别使用来自测试平台的视频数据和时间戳事件消息来推断相同的参与者和主题(知识训练条件)以及参与者行动的相同实例(救援受害者)。总体而言,ASI智能体在推断知识训练条件和预测行为方面比人类观察者表现得更好。精炼人类标准可以指导复杂任务环境和团队组成的ASI代理的设计和评估。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
Topics in Cognitive Science
Topics in Cognitive Science PSYCHOLOGY, EXPERIMENTAL-
CiteScore
8.50
自引率
10.00%
发文量
52
期刊介绍: Topics in Cognitive Science (topiCS) is an innovative new journal that covers all areas of cognitive science including cognitive modeling, cognitive neuroscience, cognitive anthropology, and cognitive science and philosophy. topiCS aims to provide a forum for: -New communities of researchers- New controversies in established areas- Debates and commentaries- Reflections and integration The publication features multiple scholarly papers dedicated to a single topic. Some of these topics will appear together in one issue, but others may appear across several issues or develop into a regular feature. Controversies or debates started in one issue may be followed up by commentaries in a later issue, etc. However, the format and origin of the topics will vary greatly.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信